2022-11-23T01:10:29.4859142Z Requested labels: linux.8xlarge.nvidia.gpu 2022-11-23T01:10:29.4859244Z Job defined at: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/master 2022-11-23T01:10:29.4859283Z Waiting for a runner to pick up this job... 2022-11-23T01:10:29.6321208Z Job is about to start running on the runner: i-08478b31fddc5d09b (organization) 2022-11-23T01:10:34.4159335Z Current runner version: '2.299.1' 2022-11-23T01:10:34.4166946Z Runner name: 'i-08478b31fddc5d09b' 2022-11-23T01:10:34.4167645Z Runner group name: 'Default' 2022-11-23T01:10:34.4168505Z Machine name: 'ip-10-0-4-85' 2022-11-23T01:10:34.4171167Z ##[group]GITHUB_TOKEN Permissions 2022-11-23T01:10:34.4172211Z Actions: write 2022-11-23T01:10:34.4172612Z Checks: write 2022-11-23T01:10:34.4173032Z Contents: write 2022-11-23T01:10:34.4173470Z Deployments: write 2022-11-23T01:10:34.4173857Z Discussions: write 2022-11-23T01:10:34.4174340Z Issues: write 2022-11-23T01:10:34.4174753Z Metadata: read 2022-11-23T01:10:34.4175141Z Packages: write 2022-11-23T01:10:34.4175595Z Pages: write 2022-11-23T01:10:34.4176097Z PullRequests: write 2022-11-23T01:10:34.4176538Z RepositoryProjects: write 2022-11-23T01:10:34.4177011Z SecurityEvents: write 2022-11-23T01:10:34.4177450Z Statuses: write 2022-11-23T01:10:34.4177826Z ##[endgroup] 2022-11-23T01:10:34.4182054Z Secret source: Actions 2022-11-23T01:10:34.4182808Z Prepare workflow directory 2022-11-23T01:10:34.5465779Z Prepare all required actions 2022-11-23T01:10:34.5688524Z Getting action download info 2022-11-23T01:10:34.7636033Z Download action repository 'pytorch/test-infra@main' (SHA:c57ff4d9a93667a5571a80a0e92c3e2674aeedfd) 2022-11-23T01:10:35.1011879Z Download action repository 'pytorch/pytorch@master' (SHA:1cfd3858ac54fe3883534309081631a0a892ba3f) 2022-11-23T01:10:38.3393250Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2022-11-23T01:10:38.6389662Z Getting action download info 
2022-11-23T01:10:38.7901314Z Download action repository 'malfet/checkout@silent-checkout' (SHA:c7b8fef48edfe1bca0044a44b1f7f7c4318a3076) 2022-11-23T01:10:38.9728699Z Getting action download info 2022-11-23T01:10:39.1573175Z Download action repository 'nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767' (SHA:7d4a37704547a311dbb66ebdf5b23ec19374a767) 2022-11-23T01:10:39.2817586Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml 2022-11-23T01:10:39.2819993Z ##[group] Inputs 2022-11-23T01:10:39.2820375Z build-environment: linux-bionic-cuda11.6-py3.10-gcc7 2022-11-23T01:10:39.2821681Z test-matrix: { include: [ { config: "default", shard: 1, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 2, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 3, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "default", shard: 4, num_shards: 4, runner: "linux.4xlarge.nvidia.gpu" }, { config: "distributed", shard: 1, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "distributed", shard: 2, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "distributed", shard: 3, num_shards: 3, runner: "linux.8xlarge.nvidia.gpu" }, { config: "functorch", shard: 1, num_shards: 1, runner: "linux.4xlarge.nvidia.gpu" }, { config: "deploy", shard: 1, num_shards: 1, runner: "linux.4xlarge.nvidia.gpu" }, ]} 2022-11-23T01:10:39.2823072Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:10:39.2823565Z sync-tag: 2022-11-23T01:10:39.2824625Z timeout-minutes: 240 2022-11-23T01:10:39.2824918Z ##[endgroup] 2022-11-23T01:10:39.2825764Z Complete job name: linux-bionic-cuda11.6-py3.10-gcc7 / test (distributed, 3, 3, linux.8xlarge.nvidia.gpu) 2022-11-23T01:10:39.3895796Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2022-11-23T01:10:39.3896203Z with: 
2022-11-23T01:10:39.3896796Z github-secret: *** 2022-11-23T01:10:39.3897105Z activate-with-label: false 2022-11-23T01:10:39.3897359Z label: with-ssh 2022-11-23T01:10:39.3897628Z remove-existing-keys: true 2022-11-23T01:10:39.3897893Z env: 2022-11-23T01:10:39.3898117Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:39.3898382Z ##[endgroup] 2022-11-23T01:10:39.4949724Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys 2022-11-23T01:10:39.5171110Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@master 2022-11-23T01:10:39.5171488Z with: 2022-11-23T01:10:39.5171716Z submodules: recursive 2022-11-23T01:10:39.5171979Z fetch-depth: 0 2022-11-23T01:10:39.5172209Z env: 2022-11-23T01:10:39.5172447Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:39.5172748Z ##[endgroup] 2022-11-23T01:10:39.5455601Z ##[group]Run retry () { 2022-11-23T01:10:39.5455932Z retry () { 2022-11-23T01:10:39.5456229Z  $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*) 2022-11-23T01:10:39.5456525Z } 2022-11-23T01:10:39.5456788Z echo "${GITHUB_WORKSPACE}" 2022-11-23T01:10:39.5457072Z if [ -z "${NO_SUDO}" ]; then 2022-11-23T01:10:39.5457387Z  retry sudo rm -rf "${GITHUB_WORKSPACE}" 2022-11-23T01:10:39.5457671Z else 2022-11-23T01:10:39.5457932Z  retry rm -rf "${GITHUB_WORKSPACE}" 2022-11-23T01:10:39.5458205Z fi 2022-11-23T01:10:39.5458502Z mkdir "${GITHUB_WORKSPACE}" 2022-11-23T01:10:39.5476632Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:10:39.5476947Z env: 2022-11-23T01:10:39.5477201Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:39.5477463Z NO_SUDO: 2022-11-23T01:10:39.5477684Z ##[endgroup] 2022-11-23T01:10:39.5601953Z /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:10:39.5960388Z ##[group]Run malfet/checkout@silent-checkout 2022-11-23T01:10:39.5960688Z with: 2022-11-23T01:10:39.5960962Z ref: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:10:39.5961229Z fetch-depth: 0 
2022-11-23T01:10:39.5961483Z submodules: recursive 2022-11-23T01:10:39.5961746Z quiet-checkout: true 2022-11-23T01:10:39.5962006Z repository: pytorch/pytorch 2022-11-23T01:10:39.5962474Z token: *** 2022-11-23T01:10:39.5962727Z ssh-strict: true 2022-11-23T01:10:39.5962981Z persist-credentials: true 2022-11-23T01:10:39.5963246Z clean: true 2022-11-23T01:10:39.5963482Z lfs: false 2022-11-23T01:10:39.5963731Z set-safe-directory: true 2022-11-23T01:10:39.5963981Z env: 2022-11-23T01:10:39.5964219Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:10:39.5964464Z ##[endgroup] 2022-11-23T01:10:39.7464766Z Syncing repository: pytorch/pytorch 2022-11-23T01:10:39.7466627Z ##[group]Getting Git version info 2022-11-23T01:10:39.7467163Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-11-23T01:10:39.7467778Z [command]/usr/bin/git version 2022-11-23T01:10:39.7468053Z git version 2.37.1 2022-11-23T01:10:39.7479398Z ##[endgroup] 2022-11-23T01:10:39.7499582Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/9ef85718-2ac5-4de8-bb3d-f420c1c3e07c' before making global git config changes 2022-11-23T01:10:39.7500808Z Adding repository directory to the temporary git global config as a safe directory 2022-11-23T01:10:39.7506985Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:10:39.7552578Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2022-11-23T01:10:39.7559395Z ##[group]Initializing the repository 2022-11-23T01:10:39.7563142Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:10:39.7594431Z hint: Using 'master' as the name for the initial branch. This default branch name 2022-11-23T01:10:39.7594908Z hint: is subject to change. 
To configure the initial branch name to use in all 2022-11-23T01:10:39.7595347Z hint: of your new repositories, which will suppress this warning, call: 2022-11-23T01:10:39.7595671Z hint: 2022-11-23T01:10:39.7596031Z hint: git config --global init.defaultBranch <name> 2022-11-23T01:10:39.7596330Z hint: 2022-11-23T01:10:39.7596722Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2022-11-23T01:10:39.7597212Z hint: 'development'. The just-created branch can be renamed via this command: 2022-11-23T01:10:39.7597540Z hint: 2022-11-23T01:10:39.7597988Z hint: git branch -m <name> 2022-11-23T01:10:39.7598611Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2022-11-23T01:10:39.7608603Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2022-11-23T01:10:39.7643337Z ##[endgroup] 2022-11-23T01:10:39.7643818Z ##[group]Disabling automatic garbage collection 2022-11-23T01:10:39.7648096Z [command]/usr/bin/git config --local gc.auto 0 2022-11-23T01:10:39.7678867Z ##[endgroup] 2022-11-23T01:10:39.7679918Z ##[group]Setting up auth 2022-11-23T01:10:39.7689255Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-11-23T01:10:39.7723632Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-11-23T01:10:39.8076227Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-11-23T01:10:39.8106120Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-11-23T01:10:39.8397452Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-11-23T01:10:39.8444517Z ##[endgroup] 2022-11-23T01:10:39.8444989Z 
##[group]Fetching the repository 2022-11-23T01:10:39.8453530Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --quiet --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2022-11-23T01:11:32.2347759Z [command]/usr/bin/git rev-parse --verify --quiet 1cfd3858ac54fe3883534309081631a0a892ba3f^{object} 2022-11-23T01:11:32.2375496Z 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:11:32.2383600Z ##[endgroup] 2022-11-23T01:11:32.2384418Z ##[group]Determining the checkout info 2022-11-23T01:11:32.2385369Z ##[endgroup] 2022-11-23T01:11:32.2386460Z ##[group]Checking out the ref 2022-11-23T01:11:32.2388589Z [command]/usr/bin/git checkout --quiet --force 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:11:33.9623596Z ##[endgroup] 2022-11-23T01:11:33.9624460Z ##[group]Setting up auth for fetching submodules 2022-11-23T01:11:33.9631519Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2022-11-23T01:11:33.9685771Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2022-11-23T01:11:33.9718132Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2022-11-23T01:11:33.9749178Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2022-11-23T01:11:33.9779362Z ##[endgroup] 2022-11-23T01:11:33.9779842Z ##[group]Fetching submodules 2022-11-23T01:11:33.9784951Z [command]/usr/bin/git submodule sync --recursive 2022-11-23T01:11:34.0108481Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2022-11-23T01:11:34.0411761Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2022-11-23T01:11:34.0414496Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2022-11-23T01:11:34.0417440Z Submodule 
'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2022-11-23T01:11:34.0420823Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2022-11-23T01:11:34.0424592Z Submodule 'third_party/QNNPACK' (https://github.com/pytorch/QNNPACK) registered for path 'third_party/QNNPACK' 2022-11-23T01:11:34.0428305Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2022-11-23T01:11:34.0432241Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2022-11-23T01:11:34.0435896Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2022-11-23T01:11:34.0439304Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2022-11-23T01:11:34.0442939Z Submodule 'third_party/cub' (https://github.com/NVlabs/cub.git) registered for path 'third_party/cub' 2022-11-23T01:11:34.0447048Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2022-11-23T01:11:34.0451093Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2022-11-23T01:11:34.0455500Z Submodule 'third_party/eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'third_party/eigen' 2022-11-23T01:11:34.0459961Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2022-11-23T01:11:34.0464443Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2022-11-23T01:11:34.0469438Z Submodule 'third_party/fmt' 
(https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2022-11-23T01:11:34.0474613Z Submodule 'third_party/foxi' (https://github.com/houseroad/foxi.git) registered for path 'third_party/foxi' 2022-11-23T01:11:34.0479533Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:11:34.0484648Z Submodule 'third_party/gloo' (https://github.com/facebookincubator/gloo) registered for path 'third_party/gloo' 2022-11-23T01:11:34.0489943Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2022-11-23T01:11:34.0495473Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2022-11-23T01:11:34.0500884Z Submodule 'third_party/ios-cmake' (https://github.com/Yangqing/ios-cmake.git) registered for path 'third_party/ios-cmake' 2022-11-23T01:11:34.0506510Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2022-11-23T01:11:34.0513307Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2022-11-23T01:11:34.0518777Z Submodule 'third_party/nccl/nccl' (https://github.com/NVIDIA/nccl) registered for path 'third_party/nccl/nccl' 2022-11-23T01:11:34.0524744Z Submodule 'third_party/neon2sse' (https://github.com/intel/ARM_NEON_2_x86_SSE.git) registered for path 'third_party/neon2sse' 2022-11-23T01:11:34.0530892Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2022-11-23T01:11:34.0537125Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2022-11-23T01:11:34.0543542Z Submodule 'third_party/onnx-tensorrt' (https://github.com/onnx/onnx-tensorrt) registered for path 'third_party/onnx-tensorrt' 2022-11-23T01:11:34.0551004Z Submodule 
'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2022-11-23T01:11:34.0558709Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2022-11-23T01:11:34.0565339Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2022-11-23T01:11:34.0572294Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2022-11-23T01:11:34.0579255Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2022-11-23T01:11:34.0586488Z Submodule 'third_party/python-enum' (https://github.com/PeachPy/enum34.git) registered for path 'third_party/python-enum' 2022-11-23T01:11:34.0594378Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2022-11-23T01:11:34.0601538Z Submodule 'third_party/python-six' (https://github.com/benjaminp/six.git) registered for path 'third_party/python-six' 2022-11-23T01:11:34.0608973Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2022-11-23T01:11:34.0616625Z Submodule 'third_party/tbb' (https://github.com/01org/tbb) registered for path 'third_party/tbb' 2022-11-23T01:11:34.0624601Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2022-11-23T01:11:34.0633657Z Submodule 'third_party/zstd' (https://github.com/facebook/zstd.git) registered for path 'third_party/zstd' 2022-11-23T01:11:34.0660844Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2022-11-23T01:11:34.3541875Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 
2022-11-23T01:11:34.5592520Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2022-11-23T01:11:34.7684239Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2022-11-23T01:11:35.0712876Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/QNNPACK'... 2022-11-23T01:11:35.3347804Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2022-11-23T01:11:37.5415769Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2022-11-23T01:11:43.3271353Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2022-11-23T01:11:43.7880865Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2022-11-23T01:11:44.3483460Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cub'... 2022-11-23T01:11:46.0648462Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2022-11-23T01:11:47.5819013Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2022-11-23T01:11:49.2474776Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/eigen'... 2022-11-23T01:11:56.3074765Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2022-11-23T01:11:57.0787191Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2022-11-23T01:11:58.7083160Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2022-11-23T01:11:59.8297629Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/foxi'... 2022-11-23T01:12:00.0386029Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 
2022-11-23T01:12:00.5509574Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2022-11-23T01:12:00.9372816Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2022-11-23T01:12:01.9977207Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2022-11-23T01:12:02.4677563Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ios-cmake'... 2022-11-23T01:12:02.7226181Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2022-11-23T01:12:03.0131485Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2022-11-23T01:12:05.3629825Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nccl/nccl'... 2022-11-23T01:12:05.8721408Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/neon2sse'... 2022-11-23T01:12:06.2598354Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2022-11-23T01:12:12.7844960Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2022-11-23T01:12:14.4111224Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt'... 2022-11-23T01:12:14.8650339Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2022-11-23T01:12:15.1055455Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2022-11-23T01:12:21.4373132Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2022-11-23T01:12:21.6438391Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2022-11-23T01:12:21.8702492Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 
2022-11-23T01:12:22.7493459Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-enum'... 2022-11-23T01:12:23.0048199Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2022-11-23T01:12:23.3406394Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-six'... 2022-11-23T01:12:23.6560714Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2022-11-23T01:12:24.2368301Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tbb'... 2022-11-23T01:12:26.7934614Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2022-11-23T01:12:27.3026973Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/zstd'... 2022-11-23T01:12:29.6699034Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2022-11-23T01:12:29.6824399Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2022-11-23T01:12:29.6920996Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2022-11-23T01:12:29.7205785Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2022-11-23T01:12:29.7472121Z Submodule path 'third_party/QNNPACK': checked out '7d2a4e9931a82adc3814275b6219a03e24e36b4c' 2022-11-23T01:12:29.7897940Z Submodule path 'third_party/VulkanMemoryAllocator': checked out 'a6bfc237255a6bac1513f7c1ebde6d8aed6b5191' 2022-11-23T01:12:30.5427147Z Submodule path 'third_party/XNNPACK': checked out 'ae108ef49aa5623b896fc93d4298c49d1750d9ba' 2022-11-23T01:12:30.5675386Z Submodule path 'third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-11-23T01:12:30.6885341Z Submodule path 'third_party/cpuinfo': checked out '8ec7bd91ad0470e61cf38f618cc1f270dede599c' 2022-11-23T01:12:30.7293026Z Submodule path 
'third_party/cub': checked out 'd106ddb991a56c3df1b6d51b2409e36ba8181ce4' 2022-11-23T01:12:31.0873920Z Submodule path 'third_party/cudnn_frontend': checked out '171a7a986f7fbd9ed71bd0cf3c7ad4f55843d6b3' 2022-11-23T01:12:31.6098027Z Submodule path 'third_party/cutlass': checked out 'b72cbf957df8cf84a6d0ff91c190ad51a9c1d24a' 2022-11-23T01:12:31.9081433Z Submodule path 'third_party/eigen': checked out '3147391d946bb4b6c68edd901f2add6ac1f31f8c' 2022-11-23T01:12:31.9636189Z Submodule path 'third_party/fbgemm': checked out '4d1738b3142a6cb0c032cd639e239566010b054a' 2022-11-23T01:12:31.9652715Z Submodule 'third_party/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:12:31.9655783Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:12:31.9658778Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:12:31.9662313Z Submodule 'third_party/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:12:31.9688912Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/asmjit'... 2022-11-23T01:12:32.9132930Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/cpuinfo'... 2022-11-23T01:12:33.4683375Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/googletest'... 2022-11-23T01:12:34.4539659Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/third_party/hipify_torch'... 
2022-11-23T01:12:34.7751914Z Submodule path 'third_party/fbgemm/third_party/asmjit': checked out 'd3fbf7c9bc7c1d1365a94a45614b91c5a3706b81' 2022-11-23T01:12:34.8990696Z Submodule path 'third_party/fbgemm/third_party/cpuinfo': checked out 'ed8b86a253800bafdb7b25c5c399f91bff9cb1f3' 2022-11-23T01:12:34.9690925Z Submodule path 'third_party/fbgemm/third_party/googletest': checked out 'cbf019de22c8dd37b2108da35b2748fd702d1796' 2022-11-23T01:12:34.9806133Z Submodule path 'third_party/fbgemm/third_party/hipify_torch': checked out '1840658c184f3eeba787dae0f06c45756c1daaf5' 2022-11-23T01:12:35.0928863Z Submodule path 'third_party/flatbuffers': checked out 'd0cede9c90c5257537c293517a21376408b549fa' 2022-11-23T01:12:35.1343511Z Submodule path 'third_party/fmt': checked out '7bdf0628b1276379886c7f6dda2cef2b3b374f0b' 2022-11-23T01:12:35.1442738Z Submodule path 'third_party/foxi': checked out 'c278588e34e535f0bb8f00df3880d26928038cad' 2022-11-23T01:12:35.1896208Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2022-11-23T01:12:35.2180778Z Submodule path 'third_party/gloo': checked out '4a5e339b764261d20fc409071dc7a8b8989aa195' 2022-11-23T01:12:35.2716039Z Submodule path 'third_party/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2022-11-23T01:12:35.2842244Z Submodule path 'third_party/ideep': checked out '5ddc65efe0428bbce2942b3ce5e3ce15239abe2f' 2022-11-23T01:12:35.2858360Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2022-11-23T01:12:35.2884883Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 
2022-11-23T01:12:44.1471369Z Submodule path 'third_party/ideep/mkl-dnn': checked out 'd19d0f795c60695bd32f894c6f01771b2dfbe24d' 2022-11-23T01:12:44.1492304Z Submodule 'third_party/oneDNN' (https://github.com/oneapi-src/oneDNN.git) registered for path 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:12:44.1520447Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn/third_party/oneDNN'... 2022-11-23T01:12:53.0891018Z Submodule path 'third_party/ideep/mkl-dnn/third_party/oneDNN': checked out '650085b2f3643aad05c629425983491d63b5c289' 2022-11-23T01:12:53.1008196Z Submodule path 'third_party/ios-cmake': checked out '8abaed637d56f1337d6e1d2c4026e25c1eade724' 2022-11-23T01:12:53.1175023Z Submodule path 'third_party/ittapi': checked out '5b8a7d7422611c3a0d799fb5fc5dd4abfae35b42' 2022-11-23T01:12:53.2286689Z Submodule path 'third_party/kineto': checked out '6c1629809068efd78a8d56b4aa479c7ec49ae562' 2022-11-23T01:12:53.2304790Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:12:53.2307754Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:12:53.2334229Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2022-11-23T01:12:54.3641725Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 
2022-11-23T01:12:55.4314087Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '2591ab91c3898c9f6544fff04660276537d32ffd' 2022-11-23T01:12:55.4954341Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2022-11-23T01:12:55.5198252Z Submodule path 'third_party/nccl/nccl': checked out 'f89fd4777d2ef9229c039ff750ae21da01626f52' 2022-11-23T01:12:55.5353693Z Submodule path 'third_party/neon2sse': checked out '97a126f08ce318023be604d03f88bf0820a9464a' 2022-11-23T01:12:55.6649543Z Submodule path 'third_party/nlohmann': checked out '87cda1d6646592ac5866dc703c8e1839046a6806' 2022-11-23T01:12:55.9702003Z Submodule path 'third_party/onnx': checked out 'f7ee1ac60d06abe8e26c9b6bbe1e3db5286b614b' 2022-11-23T01:12:55.9734111Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx/third_party/benchmark' 2022-11-23T01:12:55.9737364Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2022-11-23T01:12:55.9764660Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/benchmark'... 2022-11-23T01:12:56.3846834Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 
2022-11-23T01:12:57.3250923Z Submodule path 'third_party/onnx/third_party/benchmark': checked out '0d98dba29d66e93259db7daa53a9327df767a415' 2022-11-23T01:12:57.3620983Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'ffa346860b306c9bbfb341aed9c14c067751feb8' 2022-11-23T01:12:57.3795006Z Submodule path 'third_party/onnx-tensorrt': checked out 'c153211418a7c57ce071d9ce2a41f8d1c85a878f' 2022-11-23T01:12:57.3812804Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:12:57.3837627Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx'... 2022-11-23T01:12:59.2312188Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx': checked out '765f5ee823a67a866f4bd28a9860e81f3c811ce8' 2022-11-23T01:12:59.2333362Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:12:59.2380845Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:12:59.2381649Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark'... 2022-11-23T01:12:59.6464922Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11'... 
2022-11-23T01:13:00.5295303Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark': checked out 'e776aa0275e293707b6a0901e0e8d8a8a3679508' 2022-11-23T01:13:00.6066667Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11': checked out 'a1041190c8b8ff0cd9e2f0752248ad5e3789ea0c' 2022-11-23T01:13:00.6083319Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:00.6109205Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang'... 2022-11-23T01:13:00.8668807Z Submodule path 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-11-23T01:13:00.8774407Z Submodule path 'third_party/pocketfft': checked out 'ea778e37710c07723435b1be58235996d1d43a5a' 2022-11-23T01:13:01.1958905Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2022-11-23T01:13:01.1981745Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:01.1985469Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:01.2012889Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2022-11-23T01:13:02.5669161Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 
2022-11-23T01:13:03.6538040Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2022-11-23T01:13:03.7355231Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2022-11-23T01:13:03.7451005Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2022-11-23T01:13:03.7574628Z Submodule path 'third_party/pthreadpool': checked out 'a134dd5d4cee80cce15db81a72e7f929d71dd413' 2022-11-23T01:13:03.7968469Z Submodule path 'third_party/pybind11': checked out '80dc998efced8ceb2be59756668a7e90e8bef917' 2022-11-23T01:13:03.8065747Z Submodule path 'third_party/python-enum': checked out '4cfedc426c4e2fc52e3f5c2b4297e15ed8d6b8c7' 2022-11-23T01:13:03.8393203Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2022-11-23T01:13:03.8497290Z Submodule path 'third_party/python-six': checked out '15e31431af97e5e64b80af0a3f598d382bcdd49a' 2022-11-23T01:13:03.9015240Z Submodule path 'third_party/sleef': checked out 'e0a003ee838b75d11763aa9c3ef17bf71a725bff' 2022-11-23T01:13:04.0338997Z Submodule path 'third_party/tbb': checked out 'a51a90bc609bb73db8ea13841b5cf7aa4344d4a9' 2022-11-23T01:13:04.0648201Z Submodule path 'third_party/tensorpipe': checked out '52791a2fd214b2a9dc5759d36725909c1daa7f2e' 2022-11-23T01:13:04.0665811Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:04.0669089Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:04.0672948Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:04.0676365Z Submodule 'third_party/pybind11' 
(https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:04.0703026Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2022-11-23T01:13:05.1166136Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2022-11-23T01:13:05.3861976Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2022-11-23T01:13:06.7765386Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2022-11-23T01:13:07.8367727Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2022-11-23T01:13:07.8536636Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2022-11-23T01:13:07.9315513Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '1dff88e5161cba5c59276d2070d2e304e4dcb242' 2022-11-23T01:13:07.9641811Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2022-11-23T01:13:07.9658880Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:07.9685690Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 
2022-11-23T01:13:08.5081255Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2022-11-23T01:13:08.6665955Z Submodule path 'third_party/zstd': checked out 'aec56a52fbab207fc639a1937d1e708a282edca8' 2022-11-23T01:13:08.6707482Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2022-11-23T01:13:08.7031373Z Entering 'android/libs/fbjni' 2022-11-23T01:13:08.7073809Z Entering 'third_party/FP16' 2022-11-23T01:13:08.7117701Z Entering 'third_party/FXdiv' 2022-11-23T01:13:08.7160539Z Entering 'third_party/NNPACK' 2022-11-23T01:13:08.7203963Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:08.7247100Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:08.7290536Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:08.7343562Z Entering 'third_party/benchmark' 2022-11-23T01:13:08.7386271Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:08.7428454Z Entering 'third_party/cub' 2022-11-23T01:13:08.7471774Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:08.7521478Z Entering 'third_party/cutlass' 2022-11-23T01:13:08.7570723Z Entering 'third_party/eigen' 2022-11-23T01:13:08.7615325Z Entering 'third_party/fbgemm' 2022-11-23T01:13:08.7658688Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:08.7700399Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:08.7742153Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:08.7783426Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:08.7826458Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:08.7871018Z Entering 'third_party/fmt' 2022-11-23T01:13:08.7913648Z Entering 'third_party/foxi' 2022-11-23T01:13:08.7955697Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:08.7999003Z Entering 'third_party/gloo' 2022-11-23T01:13:08.8041342Z Entering 'third_party/googletest' 2022-11-23T01:13:08.8085578Z Entering 'third_party/ideep' 
2022-11-23T01:13:08.8127978Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:08.8171622Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:08.8220093Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:08.8261785Z Entering 'third_party/ittapi' 2022-11-23T01:13:08.8303083Z Entering 'third_party/kineto' 2022-11-23T01:13:08.8344989Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:08.8386265Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:08.8430255Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:08.8472200Z Entering 'third_party/neon2sse' 2022-11-23T01:13:08.8514207Z Entering 'third_party/nlohmann' 2022-11-23T01:13:08.8558094Z Entering 'third_party/onnx' 2022-11-23T01:13:08.8612248Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:08.8654622Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:08.8698823Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:08.8739184Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:08.8786151Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:08.8827448Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:08.8869023Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:08.8917385Z Entering 'third_party/pocketfft' 2022-11-23T01:13:08.8959084Z Entering 'third_party/protobuf' 2022-11-23T01:13:08.9003972Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:08.9045411Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:08.9088214Z Entering 'third_party/psimd' 2022-11-23T01:13:08.9129981Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:08.9171415Z Entering 'third_party/pybind11' 2022-11-23T01:13:08.9214280Z Entering 'third_party/python-enum' 2022-11-23T01:13:08.9256410Z Entering 'third_party/python-peachpy' 
2022-11-23T01:13:08.9298292Z Entering 'third_party/python-six' 2022-11-23T01:13:08.9340127Z Entering 'third_party/sleef' 2022-11-23T01:13:08.9382228Z Entering 'third_party/tbb' 2022-11-23T01:13:08.9427912Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:08.9471040Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:08.9512407Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:08.9553915Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:08.9595064Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:08.9635343Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:08.9679373Z Entering 'third_party/zstd' 2022-11-23T01:13:08.9732817Z ##[endgroup] 2022-11-23T01:13:08.9733542Z ##[group]Persisting credentials for submodules 2022-11-23T01:13:08.9738798Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || : 2022-11-23T01:13:09.0049569Z Entering 'android/libs/fbjni' 2022-11-23T01:13:09.0091789Z Entering 'third_party/FP16' 2022-11-23T01:13:09.0132704Z Entering 'third_party/FXdiv' 2022-11-23T01:13:09.0173758Z Entering 'third_party/NNPACK' 2022-11-23T01:13:09.0216101Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:09.0257824Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:09.0298737Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:09.0351229Z Entering 'third_party/benchmark' 2022-11-23T01:13:09.0392603Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:09.0434346Z Entering 'third_party/cub' 2022-11-23T01:13:09.0475956Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:09.0522339Z Entering 'third_party/cutlass' 2022-11-23T01:13:09.0570352Z Entering 'third_party/eigen' 2022-11-23T01:13:09.0613467Z Entering 'third_party/fbgemm' 2022-11-23T01:13:09.0655686Z Entering 
'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:09.0696091Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:09.0737395Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:09.0777782Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:09.0820209Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:09.0863590Z Entering 'third_party/fmt' 2022-11-23T01:13:09.0904832Z Entering 'third_party/foxi' 2022-11-23T01:13:09.0947050Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:09.0988717Z Entering 'third_party/gloo' 2022-11-23T01:13:09.1031669Z Entering 'third_party/googletest' 2022-11-23T01:13:09.1072937Z Entering 'third_party/ideep' 2022-11-23T01:13:09.1114000Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:09.1156786Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:09.1204330Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:09.1245254Z Entering 'third_party/ittapi' 2022-11-23T01:13:09.1286176Z Entering 'third_party/kineto' 2022-11-23T01:13:09.1327407Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:09.1368713Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:09.1411463Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:09.1453628Z Entering 'third_party/neon2sse' 2022-11-23T01:13:09.1494288Z Entering 'third_party/nlohmann' 2022-11-23T01:13:09.1537797Z Entering 'third_party/onnx' 2022-11-23T01:13:09.1592025Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:09.1633578Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:09.1678191Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:09.1718482Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:09.1764737Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:09.1806050Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 
2022-11-23T01:13:09.1847702Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:09.1893487Z Entering 'third_party/pocketfft' 2022-11-23T01:13:09.1935003Z Entering 'third_party/protobuf' 2022-11-23T01:13:09.1979717Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:09.2020606Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:09.2063727Z Entering 'third_party/psimd' 2022-11-23T01:13:09.2104947Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:09.2145447Z Entering 'third_party/pybind11' 2022-11-23T01:13:09.2186894Z Entering 'third_party/python-enum' 2022-11-23T01:13:09.2227735Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:09.2269618Z Entering 'third_party/python-six' 2022-11-23T01:13:09.2310220Z Entering 'third_party/sleef' 2022-11-23T01:13:09.2350798Z Entering 'third_party/tbb' 2022-11-23T01:13:09.2394370Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:09.2435258Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:09.2476031Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:09.2516211Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:09.2557126Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:09.2598021Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:09.2642918Z Entering 'third_party/zstd' 2022-11-23T01:13:09.2696068Z [command]/usr/bin/git submodule foreach --recursive git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url 2022-11-23T01:13:09.3005029Z Entering 'android/libs/fbjni' 2022-11-23T01:13:09.3042989Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2022-11-23T01:13:09.3059980Z Entering 'third_party/FP16' 2022-11-23T01:13:09.3099243Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2022-11-23T01:13:09.3116767Z Entering 'third_party/FXdiv' 2022-11-23T01:13:09.3154970Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2022-11-23T01:13:09.3171761Z Entering 'third_party/NNPACK' 2022-11-23T01:13:09.3210189Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2022-11-23T01:13:09.3227146Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:09.3265236Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/QNNPACK/config remote.origin.url 2022-11-23T01:13:09.3283121Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:09.3321335Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2022-11-23T01:13:09.3338839Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:09.3377687Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2022-11-23T01:13:09.3406148Z Entering 'third_party/benchmark' 2022-11-23T01:13:09.3444367Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:09.3461663Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:09.3499095Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2022-11-23T01:13:09.3517367Z Entering 'third_party/cub' 2022-11-23T01:13:09.3556189Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cub/config remote.origin.url 2022-11-23T01:13:09.3573409Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:09.3611935Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config 
remote.origin.url 2022-11-23T01:13:09.3635207Z Entering 'third_party/cutlass' 2022-11-23T01:13:09.3674218Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2022-11-23T01:13:09.3698332Z Entering 'third_party/eigen' 2022-11-23T01:13:09.3736612Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/eigen/config remote.origin.url 2022-11-23T01:13:09.3756851Z Entering 'third_party/fbgemm' 2022-11-23T01:13:09.3795460Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2022-11-23T01:13:09.3812384Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:09.3850214Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/asmjit/config remote.origin.url 2022-11-23T01:13:09.3867286Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:09.3906602Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/cpuinfo/config remote.origin.url 2022-11-23T01:13:09.3923912Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:09.3962424Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:09.3979259Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:09.4016813Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/third_party/hipify_torch/config remote.origin.url 2022-11-23T01:13:09.4035061Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:09.4073139Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2022-11-23T01:13:09.4092252Z Entering 'third_party/fmt' 2022-11-23T01:13:09.4131884Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2022-11-23T01:13:09.4149434Z Entering 'third_party/foxi' 2022-11-23T01:13:09.4187114Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/foxi/config remote.origin.url 2022-11-23T01:13:09.4204463Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:09.4242044Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2022-11-23T01:13:09.4259315Z Entering 'third_party/gloo' 2022-11-23T01:13:09.4297837Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2022-11-23T01:13:09.4315132Z Entering 'third_party/googletest' 2022-11-23T01:13:09.4353296Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:09.4370647Z Entering 'third_party/ideep' 2022-11-23T01:13:09.4409500Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2022-11-23T01:13:09.4425795Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:09.4464507Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2022-11-23T01:13:09.4483981Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:09.4522206Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/modules/third_party/oneDNN/config remote.origin.url 2022-11-23T01:13:09.4546588Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:09.4584477Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ios-cmake/config remote.origin.url 2022-11-23T01:13:09.4601805Z Entering 'third_party/ittapi' 2022-11-23T01:13:09.4639226Z 
file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2022-11-23T01:13:09.4656430Z Entering 'third_party/kineto' 2022-11-23T01:13:09.4695573Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2022-11-23T01:13:09.4713337Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:09.4750914Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2022-11-23T01:13:09.4768488Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:09.4806709Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2022-11-23T01:13:09.4824940Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:09.4863501Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nccl/nccl/config remote.origin.url 2022-11-23T01:13:09.4881431Z Entering 'third_party/neon2sse' 2022-11-23T01:13:09.4919137Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/neon2sse/config remote.origin.url 2022-11-23T01:13:09.4936019Z Entering 'third_party/nlohmann' 2022-11-23T01:13:09.4974014Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2022-11-23T01:13:09.4993212Z Entering 'third_party/onnx' 2022-11-23T01:13:09.5030556Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2022-11-23T01:13:09.5060001Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:09.5098360Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:09.5116188Z Entering 'third_party/onnx/third_party/pybind11' 
2022-11-23T01:13:09.5154088Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:09.5173416Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:09.5212811Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/config remote.origin.url 2022-11-23T01:13:09.5229208Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:09.5266829Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/config remote.origin.url 2022-11-23T01:13:09.5289297Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:09.5328844Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:09.5346111Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:09.5385204Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:09.5402901Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:09.5442239Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx-tensorrt/modules/third_party/onnx/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-11-23T01:13:09.5464103Z Entering 'third_party/pocketfft' 2022-11-23T01:13:09.5501607Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2022-11-23T01:13:09.5519817Z Entering 'third_party/protobuf' 2022-11-23T01:13:09.5558245Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config 
remote.origin.url 2022-11-23T01:13:09.5578362Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:09.5616405Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2022-11-23T01:13:09.5634147Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:09.5672409Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:09.5691116Z Entering 'third_party/psimd' 2022-11-23T01:13:09.5729652Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2022-11-23T01:13:09.5746469Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:09.5784762Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2022-11-23T01:13:09.5801962Z Entering 'third_party/pybind11' 2022-11-23T01:13:09.5840293Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:09.5857830Z Entering 'third_party/python-enum' 2022-11-23T01:13:09.5895320Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-enum/config remote.origin.url 2022-11-23T01:13:09.5912902Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:09.5950478Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2022-11-23T01:13:09.5968641Z Entering 'third_party/python-six' 2022-11-23T01:13:09.6006678Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-six/config remote.origin.url 2022-11-23T01:13:09.6023220Z Entering 'third_party/sleef' 2022-11-23T01:13:09.6062018Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config 
remote.origin.url 2022-11-23T01:13:09.6079821Z Entering 'third_party/tbb' 2022-11-23T01:13:09.6117480Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tbb/config remote.origin.url 2022-11-23T01:13:09.6136259Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:09.6174388Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2022-11-23T01:13:09.6192498Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:09.6229788Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2022-11-23T01:13:09.6246909Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:09.6284697Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2022-11-23T01:13:09.6301431Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:09.6339497Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2022-11-23T01:13:09.6357015Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:09.6395192Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2022-11-23T01:13:09.6411413Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:09.6449537Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2022-11-23T01:13:09.6469630Z Entering 'third_party/zstd' 2022-11-23T01:13:09.6507100Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/zstd/config remote.origin.url 2022-11-23T01:13:09.7643305Z [command]/usr/bin/git 
submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2022-11-23T01:13:09.7953528Z Entering 'android/libs/fbjni' 2022-11-23T01:13:09.7996122Z Entering 'third_party/FP16' 2022-11-23T01:13:09.8039155Z Entering 'third_party/FXdiv' 2022-11-23T01:13:09.8081140Z Entering 'third_party/NNPACK' 2022-11-23T01:13:09.8123697Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:09.8166401Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:09.8209557Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:09.8263042Z Entering 'third_party/benchmark' 2022-11-23T01:13:09.8305418Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:09.8348767Z Entering 'third_party/cub' 2022-11-23T01:13:09.8392578Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:09.8440233Z Entering 'third_party/cutlass' 2022-11-23T01:13:09.8488701Z Entering 'third_party/eigen' 2022-11-23T01:13:09.8535169Z Entering 'third_party/fbgemm' 2022-11-23T01:13:09.8579127Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:09.8620667Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:09.8662811Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:09.8704461Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:09.8748126Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:09.8793845Z Entering 'third_party/fmt' 2022-11-23T01:13:09.8836058Z Entering 'third_party/foxi' 2022-11-23T01:13:09.8879251Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:09.8920911Z Entering 'third_party/gloo' 2022-11-23T01:13:09.8963574Z Entering 'third_party/googletest' 2022-11-23T01:13:09.9006396Z Entering 'third_party/ideep' 2022-11-23T01:13:09.9049110Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:09.9094334Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:09.9143041Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:09.9187312Z Entering 'third_party/ittapi' 
2022-11-23T01:13:09.9229869Z Entering 'third_party/kineto' 2022-11-23T01:13:09.9272260Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:09.9315261Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:09.9358771Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:09.9400992Z Entering 'third_party/neon2sse' 2022-11-23T01:13:09.9443285Z Entering 'third_party/nlohmann' 2022-11-23T01:13:09.9487777Z Entering 'third_party/onnx' 2022-11-23T01:13:09.9542100Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:09.9584396Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:09.9628662Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:09.9672135Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:09.9719587Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:09.9763127Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:09.9804432Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:09.9850772Z Entering 'third_party/pocketfft' 2022-11-23T01:13:09.9893186Z Entering 'third_party/protobuf' 2022-11-23T01:13:09.9939590Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:09.9980974Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:10.0025973Z Entering 'third_party/psimd' 2022-11-23T01:13:10.0068160Z Entering 'third_party/pthreadpool' 2022-11-23T01:13:10.0110211Z Entering 'third_party/pybind11' 2022-11-23T01:13:10.0152621Z Entering 'third_party/python-enum' 2022-11-23T01:13:10.0194733Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:10.0237189Z Entering 'third_party/python-six' 2022-11-23T01:13:10.0278537Z Entering 'third_party/sleef' 2022-11-23T01:13:10.0321654Z Entering 'third_party/tbb' 2022-11-23T01:13:10.0365461Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:10.0407598Z Entering 
'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:10.0449620Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:10.0491185Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:10.0533913Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:10.0575835Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:10.0620536Z Entering 'third_party/zstd' 2022-11-23T01:13:10.0675673Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2022-11-23T01:13:10.0985646Z Entering 'android/libs/fbjni' 2022-11-23T01:13:10.1027452Z Entering 'third_party/FP16' 2022-11-23T01:13:10.1069934Z Entering 'third_party/FXdiv' 2022-11-23T01:13:10.1112542Z Entering 'third_party/NNPACK' 2022-11-23T01:13:10.1155321Z Entering 'third_party/QNNPACK' 2022-11-23T01:13:10.1197618Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T01:13:10.1239320Z Entering 'third_party/XNNPACK' 2022-11-23T01:13:10.1292293Z Entering 'third_party/benchmark' 2022-11-23T01:13:10.1334483Z Entering 'third_party/cpuinfo' 2022-11-23T01:13:10.1377123Z Entering 'third_party/cub' 2022-11-23T01:13:10.1419200Z Entering 'third_party/cudnn_frontend' 2022-11-23T01:13:10.1467693Z Entering 'third_party/cutlass' 2022-11-23T01:13:10.1517052Z Entering 'third_party/eigen' 2022-11-23T01:13:10.1561202Z Entering 'third_party/fbgemm' 2022-11-23T01:13:10.1603535Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T01:13:10.1645818Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T01:13:10.1688657Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T01:13:10.1730312Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T01:13:10.1773885Z Entering 'third_party/flatbuffers' 2022-11-23T01:13:10.1818803Z Entering 'third_party/fmt' 2022-11-23T01:13:10.1861789Z Entering 'third_party/foxi' 2022-11-23T01:13:10.1903903Z 
Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T01:13:10.1945553Z Entering 'third_party/gloo' 2022-11-23T01:13:10.1988719Z Entering 'third_party/googletest' 2022-11-23T01:13:10.2031073Z Entering 'third_party/ideep' 2022-11-23T01:13:10.2073542Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T01:13:10.2117443Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T01:13:10.2167371Z Entering 'third_party/ios-cmake' 2022-11-23T01:13:10.2209592Z Entering 'third_party/ittapi' 2022-11-23T01:13:10.2253503Z Entering 'third_party/kineto' 2022-11-23T01:13:10.2295966Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T01:13:10.2339226Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T01:13:10.2381918Z Entering 'third_party/nccl/nccl' 2022-11-23T01:13:10.2425266Z Entering 'third_party/neon2sse' 2022-11-23T01:13:10.2466922Z Entering 'third_party/nlohmann' 2022-11-23T01:13:10.2509805Z Entering 'third_party/onnx' 2022-11-23T01:13:10.2565720Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T01:13:10.2608206Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T01:13:10.2652431Z Entering 'third_party/onnx-tensorrt' 2022-11-23T01:13:10.2693774Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T01:13:10.2740549Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T01:13:10.2782616Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T01:13:10.2824750Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T01:13:10.2872653Z Entering 'third_party/pocketfft' 2022-11-23T01:13:10.2914511Z Entering 'third_party/protobuf' 2022-11-23T01:13:10.2960721Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T01:13:10.3002024Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T01:13:10.3044996Z Entering 'third_party/psimd' 2022-11-23T01:13:10.3086203Z Entering 
'third_party/pthreadpool' 2022-11-23T01:13:10.3127620Z Entering 'third_party/pybind11' 2022-11-23T01:13:10.3169858Z Entering 'third_party/python-enum' 2022-11-23T01:13:10.3212010Z Entering 'third_party/python-peachpy' 2022-11-23T01:13:10.3254222Z Entering 'third_party/python-six' 2022-11-23T01:13:10.3295515Z Entering 'third_party/sleef' 2022-11-23T01:13:10.3337772Z Entering 'third_party/tbb' 2022-11-23T01:13:10.3381353Z Entering 'third_party/tensorpipe' 2022-11-23T01:13:10.3423524Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T01:13:10.3465661Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T01:13:10.3507155Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T01:13:10.3548741Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T01:13:10.3589336Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T01:13:10.3634051Z Entering 'third_party/zstd' 2022-11-23T01:13:10.3685679Z ##[endgroup] 2022-11-23T01:13:10.3726621Z [command]/usr/bin/git log -1 --format='%H' 2022-11-23T01:13:10.3754341Z '1cfd3858ac54fe3883534309081631a0a892ba3f' 2022-11-23T01:13:10.3903833Z Prepare all required actions 2022-11-23T01:13:10.3935637Z ##[group]Run ./.github/actions/setup-linux 2022-11-23T01:13:10.3935921Z env: 2022-11-23T01:13:10.3936167Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:10.3936410Z ##[endgroup] 2022-11-23T01:13:10.3955112Z ##[group]Run set -euo pipefail 2022-11-23T01:13:10.3955430Z set -euo pipefail 2022-11-23T01:13:10.3955726Z function get_ec2_metadata() { 2022-11-23T01:13:10.3956054Z  # Pulled from instance metadata endpoint for EC2 2022-11-23T01:13:10.3956549Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2022-11-23T01:13:10.3956956Z  category=$1 2022-11-23T01:13:10.3957292Z  curl -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2022-11-23T01:13:10.3957586Z } 2022-11-23T01:13:10.3957865Z echo "ami-id: $(get_ec2_metadata ami-id)" 
2022-11-23T01:13:10.3958259Z echo "instance-id: $(get_ec2_metadata instance-id)" 2022-11-23T01:13:10.3958625Z echo "instance-type: $(get_ec2_metadata instance-type)" 2022-11-23T01:13:10.3958978Z echo "system info $(uname -a)" 2022-11-23T01:13:10.3971819Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:10.3972100Z env: 2022-11-23T01:13:10.3972345Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:10.3972607Z ##[endgroup] 2022-11-23T01:13:10.4069975Z ami-id: ami-096198a0bccc6bad4 2022-11-23T01:13:10.4133559Z instance-id: i-08478b31fddc5d09b 2022-11-23T01:13:10.4197342Z instance-type: g3.8xlarge 2022-11-23T01:13:10.4205171Z system info Linux ip-10-0-4-85.ec2.internal 4.14.252-195.483.amzn2.x86_64 #1 SMP Mon Nov 1 20:58:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 2022-11-23T01:13:10.4223833Z ##[group]Run if systemctl is-active --quiet docker; then 2022-11-23T01:13:10.4224222Z if systemctl is-active --quiet docker; then 2022-11-23T01:13:10.4224569Z  echo "Docker daemon is running..."; 2022-11-23T01:13:10.4224846Z else 2022-11-23T01:13:10.4225173Z  echo "Starting docker deamon..." && sudo systemctl start docker; 2022-11-23T01:13:10.4225499Z fi 2022-11-23T01:13:10.4237229Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:10.4237531Z env: 2022-11-23T01:13:10.4237778Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:10.4238024Z ##[endgroup] 2022-11-23T01:13:10.4288760Z Docker daemon is running... 
2022-11-23T01:13:10.4306587Z ##[group]Run AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-11-23T01:13:10.4307070Z AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") 2022-11-23T01:13:10.4307457Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:13:10.4307949Z retry aws ecr get-login*** "$AWS_DEFAULT_REGION" | docker login --username AWS \ 2022-11-23T01:13:10.4308431Z  --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" 2022-11-23T01:13:10.4319854Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:10.4320162Z env: 2022-11-23T01:13:10.4320408Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:10.4320664Z AWS_RETRY_MODE: standard 2022-11-23T01:13:10.4320929Z AWS_MAX_ATTEMPTS: 5 2022-11-23T01:13:10.4321255Z AWS_DEFAULT_REGION: us-east-1 2022-11-23T01:13:10.4321498Z ##[endgroup] 2022-11-23T01:13:11.3676826Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2022-11-23T01:13:11.3677413Z Configure a credential helper to remove this warning. 
See 2022-11-23T01:13:11.3677966Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2022-11-23T01:13:11.3678223Z 2022-11-23T01:13:11.3678843Z Login Succeeded 2022-11-23T01:13:11.3757259Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:13:11.3757677Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:13:11.3758153Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2022-11-23T01:13:11.3770499Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:11.3770801Z env: 2022-11-23T01:13:11.3771047Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:11.3771293Z ##[endgroup] 2022-11-23T01:13:11.3858841Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2022-11-23T01:13:11.3859200Z with: 2022-11-23T01:13:11.3859674Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:13:11.3860148Z env: 2022-11-23T01:13:11.3860395Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:11.3860638Z ##[endgroup] 2022-11-23T01:13:11.3876203Z ##[group]Run retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:13:11.3876570Z retry () { "$@" || (sleep 1 && "$@") || (sleep 2 && "$@") } 2022-11-23T01:13:11.3876938Z # ignore output since only exit code is used for conditional 2022-11-23T01:13:11.3877325Z # only pull docker image if it's not available locally 2022-11-23T01:13:11.3877724Z if ! 
docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2022-11-23T01:13:11.3878143Z  retry docker pull "${DOCKER_IMAGE}" 2022-11-23T01:13:11.3878401Z fi 2022-11-23T01:13:11.3889606Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:13:11.3889886Z env: 2022-11-23T01:13:11.3890128Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:13:11.3890642Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:13:11.3891125Z ##[endgroup] 2022-11-23T01:13:11.6364861Z 072aae4a77ed7d3a69ad5683420509c41301b940: Pulling from pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7 2022-11-23T01:13:11.6365375Z a404e5416296: Pulling fs layer 2022-11-23T01:13:11.6365662Z c58c079e9b17: Pulling fs layer 2022-11-23T01:13:11.6365947Z e5b80b8bbe91: Pulling fs layer 2022-11-23T01:13:11.6367324Z 888240790290: Pulling fs layer 2022-11-23T01:13:11.6367675Z 515fe5e34eb4: Pulling fs layer 2022-11-23T01:13:11.6368187Z 4e4521f12f5a: Pulling fs layer 2022-11-23T01:13:11.6370469Z f6e1a56cb32d: Pulling fs layer 2022-11-23T01:13:11.6370790Z c29b96e36bd0: Pulling fs layer 2022-11-23T01:13:11.6371079Z 304d3c6c28d0: Pulling fs layer 2022-11-23T01:13:11.6371368Z fac00e927cfe: Pulling fs layer 2022-11-23T01:13:11.6372974Z f0158c8d8420: Pulling fs layer 2022-11-23T01:13:11.6373282Z 3ceac802dd07: Pulling fs layer 2022-11-23T01:13:11.6373622Z 0d0e625ba887: Pulling fs layer 2022-11-23T01:13:11.6374096Z bc2be817cb7e: Pulling fs layer 2022-11-23T01:13:11.6374394Z 11eb2106b948: Pulling fs layer 2022-11-23T01:13:11.6374650Z 888240790290: Waiting 2022-11-23T01:13:11.6374889Z 515fe5e34eb4: Waiting 2022-11-23T01:13:11.6375146Z 34fa4193c7a6: Pulling fs layer 2022-11-23T01:13:11.6375421Z a7cf5b3894f8: Pulling fs layer 2022-11-23T01:13:11.6375662Z 0d0e625ba887: Waiting 2022-11-23T01:13:11.6375925Z 3f6b06edd3f5: Pulling fs layer 2022-11-23T01:13:11.6376212Z 73a2b1f75a3d: Pulling 
fs layer 2022-11-23T01:13:11.6376562Z f6e1a56cb32d: Waiting 2022-11-23T01:13:11.6376952Z a7cf5b3894f8: Waiting 2022-11-23T01:13:11.6377206Z c29b96e36bd0: Waiting 2022-11-23T01:13:11.6377446Z ba6235196410: Pulling fs layer 2022-11-23T01:13:11.6377706Z 11eb2106b948: Waiting 2022-11-23T01:13:11.6377943Z f0158c8d8420: Waiting 2022-11-23T01:13:11.6378178Z 879cdaf83543: Pulling fs layer 2022-11-23T01:13:11.6378444Z bc2be817cb7e: Waiting 2022-11-23T01:13:11.6379961Z 6ff0fc00b0a9: Pulling fs layer 2022-11-23T01:13:11.6380339Z a58b9ed071f4: Pulling fs layer 2022-11-23T01:13:11.6380614Z ba6235196410: Waiting 2022-11-23T01:13:11.6380865Z fac00e927cfe: Waiting 2022-11-23T01:13:11.6381091Z 879cdaf83543: Waiting 2022-11-23T01:13:11.6381364Z a8c562f6a1cf: Pulling fs layer 2022-11-23T01:13:11.6381637Z 0a39b4492650: Pulling fs layer 2022-11-23T01:13:11.6381877Z a8c562f6a1cf: Waiting 2022-11-23T01:13:11.6382135Z 9088ff8de269: Pulling fs layer 2022-11-23T01:13:11.6382393Z a58b9ed071f4: Waiting 2022-11-23T01:13:11.6382805Z 165006759af3: Pulling fs layer 2022-11-23T01:13:11.6383056Z 9088ff8de269: Waiting 2022-11-23T01:13:11.6383317Z 3ceac802dd07: Waiting 2022-11-23T01:13:11.6383642Z ae48b7377a0d: Pulling fs layer 2022-11-23T01:13:11.6383930Z b18965f4b6f1: Pulling fs layer 2022-11-23T01:13:11.6384184Z 34fa4193c7a6: Waiting 2022-11-23T01:13:11.6384423Z 102ddcd90753: Pulling fs layer 2022-11-23T01:13:11.6384693Z 5f5dd1cba120: Pulling fs layer 2022-11-23T01:13:11.6384969Z 8a7f50c8b503: Pulling fs layer 2022-11-23T01:13:11.6385220Z 863c35620b44: Pulling fs layer 2022-11-23T01:13:11.6385470Z 165006759af3: Waiting 2022-11-23T01:13:11.6385712Z 304d3c6c28d0: Waiting 2022-11-23T01:13:11.6385966Z 183e4209dc37: Pulling fs layer 2022-11-23T01:13:11.6386203Z ae48b7377a0d: Waiting 2022-11-23T01:13:11.6386444Z 73a2b1f75a3d: Waiting 2022-11-23T01:13:11.6386699Z a47cba6c334e: Pulling fs layer 2022-11-23T01:13:11.6386950Z a9f3d4742233: Pulling fs layer 2022-11-23T01:13:11.6387218Z 3cefa8a4607f: Pulling 
fs layer 2022-11-23T01:13:11.6387478Z 863c35620b44: Waiting 2022-11-23T01:13:11.6387696Z 183e4209dc37: Waiting 2022-11-23T01:13:11.6387929Z b18965f4b6f1: Waiting 2022-11-23T01:13:11.6388167Z a9f3d4742233: Waiting 2022-11-23T01:13:11.6388399Z 023a41fa48e6: Pulling fs layer 2022-11-23T01:13:11.6388653Z 3cefa8a4607f: Waiting 2022-11-23T01:13:11.6388893Z a47cba6c334e: Waiting 2022-11-23T01:13:11.6389613Z 96e251412f4d: Pulling fs layer 2022-11-23T01:13:11.6389870Z 023a41fa48e6: Waiting 2022-11-23T01:13:11.6390107Z 5f5dd1cba120: Waiting 2022-11-23T01:13:11.6390343Z 49d40c00cf56: Pulling fs layer 2022-11-23T01:13:11.6390611Z 7e2d6313145f: Pulling fs layer 2022-11-23T01:13:11.6390874Z 96805775a692: Pulling fs layer 2022-11-23T01:13:11.6391126Z 102ddcd90753: Waiting 2022-11-23T01:13:11.6391383Z 75f1ead35ace: Pulling fs layer 2022-11-23T01:13:11.6391639Z 7e2d6313145f: Waiting 2022-11-23T01:13:11.6391859Z 49d40c00cf56: Waiting 2022-11-23T01:13:11.6392107Z 793c37004dab: Pulling fs layer 2022-11-23T01:13:11.6392377Z cadc5661750d: Pulling fs layer 2022-11-23T01:13:11.6392638Z 6386b2adbe28: Pulling fs layer 2022-11-23T01:13:11.6392909Z 74aa250bc82f: Pulling fs layer 2022-11-23T01:13:11.6393162Z 793c37004dab: Waiting 2022-11-23T01:13:11.6393389Z 8a7f50c8b503: Waiting 2022-11-23T01:13:11.6393642Z 436525efe61d: Pulling fs layer 2022-11-23T01:13:11.6393896Z cadc5661750d: Waiting 2022-11-23T01:13:11.6394152Z 596be1fe0bda: Pulling fs layer 2022-11-23T01:13:11.6394407Z 772fa4efddc3: Pulling fs layer 2022-11-23T01:13:11.6394676Z 91ddf385377b: Pulling fs layer 2022-11-23T01:13:11.6394929Z 6386b2adbe28: Waiting 2022-11-23T01:13:11.6395167Z 9f7cfb895784: Pulling fs layer 2022-11-23T01:13:11.6395418Z 436525efe61d: Waiting 2022-11-23T01:13:11.6395667Z 8b8218af0479: Pulling fs layer 2022-11-23T01:13:11.6395902Z 91ddf385377b: Waiting 2022-11-23T01:13:11.6396139Z 9f7cfb895784: Waiting 2022-11-23T01:13:11.6396373Z 8b8218af0479: Waiting 2022-11-23T01:13:11.6396589Z 74aa250bc82f: Waiting 
2022-11-23T01:13:11.7892168Z c58c079e9b17: Download complete 2022-11-23T01:13:11.8769909Z 888240790290: Verifying Checksum 2022-11-23T01:13:11.8770364Z 888240790290: Download complete 2022-11-23T01:13:11.9332731Z e5b80b8bbe91: Verifying Checksum 2022-11-23T01:13:11.9333083Z e5b80b8bbe91: Download complete 2022-11-23T01:13:11.9517603Z 515fe5e34eb4: Download complete 2022-11-23T01:13:11.9988548Z a404e5416296: Verifying Checksum 2022-11-23T01:13:11.9988852Z a404e5416296: Download complete 2022-11-23T01:13:12.0471074Z f6e1a56cb32d: Download complete 2022-11-23T01:13:12.1347734Z 304d3c6c28d0: Verifying Checksum 2022-11-23T01:13:12.1348095Z 304d3c6c28d0: Download complete 2022-11-23T01:13:12.2124756Z fac00e927cfe: Verifying Checksum 2022-11-23T01:13:12.2126026Z fac00e927cfe: Download complete 2022-11-23T01:13:12.7773957Z a404e5416296: Pull complete 2022-11-23T01:13:13.0748201Z c58c079e9b17: Pull complete 2022-11-23T01:13:13.6145230Z e5b80b8bbe91: Pull complete 2022-11-23T01:13:13.7320434Z 888240790290: Pull complete 2022-11-23T01:13:13.8408159Z 515fe5e34eb4: Pull complete 2022-11-23T01:13:14.2614354Z f0158c8d8420: Verifying Checksum 2022-11-23T01:13:14.2614710Z f0158c8d8420: Download complete 2022-11-23T01:13:14.3383704Z 3ceac802dd07: Download complete 2022-11-23T01:13:14.4148970Z 0d0e625ba887: Download complete 2022-11-23T01:13:14.5006737Z bc2be817cb7e: Verifying Checksum 2022-11-23T01:13:14.5007056Z bc2be817cb7e: Download complete 2022-11-23T01:13:15.2233417Z 11eb2106b948: Verifying Checksum 2022-11-23T01:13:15.2233970Z 11eb2106b948: Download complete 2022-11-23T01:13:15.3244989Z 34fa4193c7a6: Verifying Checksum 2022-11-23T01:13:15.3245304Z 34fa4193c7a6: Download complete 2022-11-23T01:13:15.3980943Z a7cf5b3894f8: Verifying Checksum 2022-11-23T01:13:15.3981519Z a7cf5b3894f8: Download complete 2022-11-23T01:13:23.1555616Z 4e4521f12f5a: Verifying Checksum 2022-11-23T01:13:23.1556277Z 4e4521f12f5a: Download complete 2022-11-23T01:13:23.2283831Z 73a2b1f75a3d: Download 
complete 2022-11-23T01:13:23.2975662Z ba6235196410: Verifying Checksum 2022-11-23T01:13:23.2976008Z ba6235196410: Download complete 2022-11-23T01:13:23.3677813Z 879cdaf83543: Verifying Checksum 2022-11-23T01:13:23.3678481Z 879cdaf83543: Download complete 2022-11-23T01:13:23.4435340Z 6ff0fc00b0a9: Verifying Checksum 2022-11-23T01:13:23.4435952Z 6ff0fc00b0a9: Download complete 2022-11-23T01:13:23.5170042Z a58b9ed071f4: Verifying Checksum 2022-11-23T01:13:23.5170408Z a58b9ed071f4: Download complete 2022-11-23T01:13:23.5923984Z a8c562f6a1cf: Download complete 2022-11-23T01:13:24.7152567Z 0a39b4492650: Verifying Checksum 2022-11-23T01:13:24.7152924Z 0a39b4492650: Download complete 2022-11-23T01:13:24.8282789Z 9088ff8de269: Verifying Checksum 2022-11-23T01:13:24.8283178Z 9088ff8de269: Download complete 2022-11-23T01:13:24.8986858Z 165006759af3: Verifying Checksum 2022-11-23T01:13:24.8987184Z 165006759af3: Download complete 2022-11-23T01:13:24.9984273Z ae48b7377a0d: Verifying Checksum 2022-11-23T01:13:24.9984639Z ae48b7377a0d: Download complete 2022-11-23T01:13:25.0775466Z b18965f4b6f1: Verifying Checksum 2022-11-23T01:13:25.0776081Z b18965f4b6f1: Download complete 2022-11-23T01:13:25.1714705Z 102ddcd90753: Verifying Checksum 2022-11-23T01:13:25.1715305Z 102ddcd90753: Download complete 2022-11-23T01:13:26.5679582Z c29b96e36bd0: Verifying Checksum 2022-11-23T01:13:26.5679934Z c29b96e36bd0: Download complete 2022-11-23T01:13:26.6390155Z 8a7f50c8b503: Verifying Checksum 2022-11-23T01:13:26.6390511Z 8a7f50c8b503: Download complete 2022-11-23T01:13:26.7139207Z 863c35620b44: Verifying Checksum 2022-11-23T01:13:26.7139539Z 863c35620b44: Download complete 2022-11-23T01:13:27.1525964Z 183e4209dc37: Verifying Checksum 2022-11-23T01:13:27.1526300Z 183e4209dc37: Download complete 2022-11-23T01:13:27.1556734Z 5f5dd1cba120: Verifying Checksum 2022-11-23T01:13:27.1557042Z 5f5dd1cba120: Download complete 2022-11-23T01:13:27.2235136Z a47cba6c334e: Verifying Checksum 
2022-11-23T01:13:27.2235448Z a47cba6c334e: Download complete 2022-11-23T01:13:27.2251617Z a9f3d4742233: Verifying Checksum 2022-11-23T01:13:27.2251913Z a9f3d4742233: Download complete 2022-11-23T01:13:27.2958178Z 023a41fa48e6: Verifying Checksum 2022-11-23T01:13:27.2958534Z 023a41fa48e6: Download complete 2022-11-23T01:13:27.4837702Z 3cefa8a4607f: Verifying Checksum 2022-11-23T01:13:27.4838091Z 3cefa8a4607f: Download complete 2022-11-23T01:13:27.5610015Z 49d40c00cf56: Verifying Checksum 2022-11-23T01:13:27.5610327Z 49d40c00cf56: Download complete 2022-11-23T01:13:27.6285740Z 7e2d6313145f: Download complete 2022-11-23T01:13:27.7459868Z 96e251412f4d: Verifying Checksum 2022-11-23T01:13:27.7460210Z 96e251412f4d: Download complete 2022-11-23T01:13:27.8186304Z 75f1ead35ace: Download complete 2022-11-23T01:13:27.8870202Z 793c37004dab: Verifying Checksum 2022-11-23T01:13:27.8870853Z 793c37004dab: Download complete 2022-11-23T01:13:27.9646851Z cadc5661750d: Verifying Checksum 2022-11-23T01:13:27.9647217Z cadc5661750d: Download complete 2022-11-23T01:13:28.0558090Z 6386b2adbe28: Verifying Checksum 2022-11-23T01:13:28.0558444Z 6386b2adbe28: Download complete 2022-11-23T01:13:28.2419144Z 74aa250bc82f: Verifying Checksum 2022-11-23T01:13:28.2419690Z 74aa250bc82f: Download complete 2022-11-23T01:13:28.3071544Z 436525efe61d: Verifying Checksum 2022-11-23T01:13:28.3072296Z 436525efe61d: Download complete 2022-11-23T01:13:28.9000294Z 596be1fe0bda: Verifying Checksum 2022-11-23T01:13:28.9000710Z 596be1fe0bda: Download complete 2022-11-23T01:13:28.9618738Z 772fa4efddc3: Verifying Checksum 2022-11-23T01:13:28.9619097Z 772fa4efddc3: Download complete 2022-11-23T01:13:32.2822514Z 96805775a692: Verifying Checksum 2022-11-23T01:13:32.2822902Z 96805775a692: Download complete 2022-11-23T01:13:32.3688894Z 9f7cfb895784: Verifying Checksum 2022-11-23T01:13:32.3689542Z 9f7cfb895784: Download complete 2022-11-23T01:13:32.4600019Z 8b8218af0479: Verifying Checksum 2022-11-23T01:13:32.4600604Z 
8b8218af0479: Download complete 2022-11-23T01:13:35.1829987Z 3f6b06edd3f5: Verifying Checksum 2022-11-23T01:13:35.1830388Z 3f6b06edd3f5: Download complete 2022-11-23T01:13:37.3767179Z 4e4521f12f5a: Pull complete 2022-11-23T01:13:37.4809651Z f6e1a56cb32d: Pull complete 2022-11-23T01:13:58.9707232Z 91ddf385377b: Verifying Checksum 2022-11-23T01:13:58.9707568Z 91ddf385377b: Download complete 2022-11-23T01:13:59.5273421Z c29b96e36bd0: Pull complete 2022-11-23T01:14:01.3879928Z 304d3c6c28d0: Pull complete 2022-11-23T01:14:03.2522031Z fac00e927cfe: Pull complete 2022-11-23T01:14:11.2121941Z f0158c8d8420: Pull complete 2022-11-23T01:14:13.0563146Z 3ceac802dd07: Pull complete 2022-11-23T01:14:14.9064185Z 0d0e625ba887: Pull complete 2022-11-23T01:14:16.7861080Z bc2be817cb7e: Pull complete 2022-11-23T01:14:20.9661639Z 11eb2106b948: Pull complete 2022-11-23T01:14:23.2415236Z 34fa4193c7a6: Pull complete 2022-11-23T01:14:24.9865471Z a7cf5b3894f8: Pull complete 2022-11-23T01:15:00.7292469Z 3f6b06edd3f5: Pull complete 2022-11-23T01:15:00.8461160Z 73a2b1f75a3d: Pull complete 2022-11-23T01:15:00.9399391Z ba6235196410: Pull complete 2022-11-23T01:15:01.0466705Z 879cdaf83543: Pull complete 2022-11-23T01:15:01.1484898Z 6ff0fc00b0a9: Pull complete 2022-11-23T01:15:01.2535111Z a58b9ed071f4: Pull complete 2022-11-23T01:15:01.3566256Z a8c562f6a1cf: Pull complete 2022-11-23T01:15:03.6456532Z 0a39b4492650: Pull complete 2022-11-23T01:15:03.7483706Z 9088ff8de269: Pull complete 2022-11-23T01:15:03.8364803Z 165006759af3: Pull complete 2022-11-23T01:15:03.9751762Z ae48b7377a0d: Pull complete 2022-11-23T01:15:04.0804205Z b18965f4b6f1: Pull complete 2022-11-23T01:15:04.1978197Z 102ddcd90753: Pull complete 2022-11-23T01:15:11.4725403Z 5f5dd1cba120: Pull complete 2022-11-23T01:15:13.2914962Z 8a7f50c8b503: Pull complete 2022-11-23T01:15:15.1692350Z 863c35620b44: Pull complete 2022-11-23T01:15:17.9818900Z 183e4209dc37: Pull complete 2022-11-23T01:15:19.7879465Z a47cba6c334e: Pull complete 
2022-11-23T01:15:21.7574258Z a9f3d4742233: Pull complete 2022-11-23T01:15:25.0939133Z 3cefa8a4607f: Pull complete 2022-11-23T01:15:28.2352066Z 023a41fa48e6: Pull complete 2022-11-23T01:15:32.9137447Z 96e251412f4d: Pull complete 2022-11-23T01:15:34.7863551Z 49d40c00cf56: Pull complete 2022-11-23T01:15:37.7744338Z 7e2d6313145f: Pull complete 2022-11-23T01:15:46.0245175Z 96805775a692: Pull complete 2022-11-23T01:15:48.8760689Z 75f1ead35ace: Pull complete 2022-11-23T01:15:50.5990722Z 793c37004dab: Pull complete 2022-11-23T01:15:52.4467218Z cadc5661750d: Pull complete 2022-11-23T01:15:54.5232681Z 6386b2adbe28: Pull complete 2022-11-23T01:15:57.7887308Z 74aa250bc82f: Pull complete 2022-11-23T01:15:59.6706896Z 436525efe61d: Pull complete 2022-11-23T01:16:03.4587738Z 596be1fe0bda: Pull complete 2022-11-23T01:16:03.6680380Z 772fa4efddc3: Pull complete 2022-11-23T01:16:45.8710743Z 91ddf385377b: Pull complete 2022-11-23T01:16:47.7499522Z 9f7cfb895784: Pull complete 2022-11-23T01:16:49.6293439Z 8b8218af0479: Pull complete 2022-11-23T01:16:50.9771890Z Digest: sha256:3a5626edfb2c43fb24303351be75287af92426b6bb7c6df2defc98f980346c6a 2022-11-23T01:16:51.4779714Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:16:51.6874489Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:16:51.7658811Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2022-11-23T01:16:51.7659156Z with: 2022-11-23T01:16:51.7659407Z driver-version: 515.76 2022-11-23T01:16:51.7659662Z env: 2022-11-23T01:16:51.7659888Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:16:51.7660157Z ##[endgroup] 2022-11-23T01:16:51.8368765Z ##[group]Run nick-fields/retry@7d4a37704547a311dbb66ebdf5b23ec19374a767 2022-11-23T01:16:51.8369070Z with: 2022-11-23T01:16:51.8369311Z 
timeout_minutes: 10 2022-11-23T01:16:51.8369570Z max_attempts: 3 2022-11-23T01:16:51.8375708Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. /etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y nvidia-docker2 sudo systemctl restart docker ) } install_nvidia_driver_amzn2() { ( set -x # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. 
Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi NVIDIA_SMI_STATUS=$? 
# Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac echo "GPU_FLAG=--gpus all" >> "${GITHUB_ENV}" 2022-11-23T01:16:51.8382058Z retry_wait_seconds: 10 2022-11-23T01:16:51.8382325Z polling_interval_seconds: 1 2022-11-23T01:16:51.8382609Z warning_on_retry: true 2022-11-23T01:16:51.8382885Z continue_on_error: false 2022-11-23T01:16:51.8383115Z env: 2022-11-23T01:16:51.8383366Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:16:51.8383641Z DRIVER_VERSION: 515.76 2022-11-23T01:16:51.8383878Z ##[endgroup] 2022-11-23T01:16:51.8977425Z 2022-11-23T01:16:51.8999477Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:16:51.9043804Z == Installing nvidia driver NVIDIA-Linux-x86_64-515.76.run == 2022-11-23T01:16:51.9046031Z + sudo yum remove -y nvidia-driver-latest-dkms 2022-11-23T01:16:52.3725953Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:16:52.4152339Z No Match for argument: nvidia-driver-latest-dkms 2022-11-23T01:16:52.4441166Z No Packages marked for removal 2022-11-23T01:16:52.4601191Z + echo 'Before installing NVIDIA driver' 2022-11-23T01:16:52.4601492Z + lspci 2022-11-23T01:16:52.4601750Z Before installing NVIDIA driver 2022-11-23T01:16:53.6969830Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) 2022-11-23T01:16:53.6970301Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2022-11-23T01:16:53.6970743Z 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 2022-11-23T01:16:53.6972023Z 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01) 2022-11-23T01:16:53.6972468Z 00:02.0 VGA compatible controller: Cirrus Logic GD 5446 2022-11-23T01:16:53.6973129Z 00:03.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2022-11-23T01:16:53.6973722Z 00:1d.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:16:53.6974204Z 00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:16:53.6974656Z 00:1f.0 Unassigned class [ff80]: XenSource, Inc. 
Xen Platform Device (rev 01) 2022-11-23T01:16:53.6974971Z + lsmod 2022-11-23T01:16:53.6996695Z Module Size Used by 2022-11-23T01:16:53.6997288Z xt_conntrack 16384 1 2022-11-23T01:16:53.6997845Z ipt_MASQUERADE 16384 1 2022-11-23T01:16:53.6998334Z nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE 2022-11-23T01:16:53.6998666Z nf_conntrack_netlink 49152 0 2022-11-23T01:16:53.6998966Z nfnetlink 16384 2 nf_conntrack_netlink 2022-11-23T01:16:53.6999279Z xfrm_user 45056 1 2022-11-23T01:16:53.6999572Z xfrm_algo 16384 1 xfrm_user 2022-11-23T01:16:53.6999884Z xt_addrtype 16384 2 2022-11-23T01:16:53.7000153Z iptable_filter 16384 1 2022-11-23T01:16:53.7000441Z iptable_nat 16384 1 2022-11-23T01:16:53.7006200Z nf_conntrack_ipv4 16384 3 2022-11-23T01:16:53.7006667Z nf_defrag_ipv4 16384 1 nf_conntrack_ipv4 2022-11-23T01:16:53.7007026Z nf_nat_ipv4 16384 1 iptable_nat 2022-11-23T01:16:53.7007624Z nf_nat 36864 2 nf_nat_masquerade_ipv4,nf_nat_ipv4 2022-11-23T01:16:53.7008444Z nf_conntrack 155648 7 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink 2022-11-23T01:16:53.7008907Z br_netfilter 24576 0 2022-11-23T01:16:53.7009216Z bridge 172032 1 br_netfilter 2022-11-23T01:16:53.7009494Z stp 16384 1 bridge 2022-11-23T01:16:53.7009816Z llc 16384 2 bridge,stp 2022-11-23T01:16:53.7010114Z overlay 86016 0 2022-11-23T01:16:53.7010397Z sunrpc 393216 1 2022-11-23T01:16:53.7010667Z dm_mirror 28672 0 2022-11-23T01:16:53.7010963Z dm_region_hash 20480 1 dm_mirror 2022-11-23T01:16:53.7011304Z dm_log 20480 2 dm_region_hash,dm_mirror 2022-11-23T01:16:53.7011615Z dm_mod 143360 2 dm_log,dm_mirror 2022-11-23T01:16:53.7011917Z dax 69632 1 dm_mod 2022-11-23T01:16:53.7012197Z sb_edac 24576 0 2022-11-23T01:16:53.7012447Z crc32_pclmul 16384 0 2022-11-23T01:16:53.7012738Z ghash_clmulni_intel 16384 0 2022-11-23T01:16:53.7013259Z pcbc 16384 0 2022-11-23T01:16:53.7013589Z ata_piix 36864 0 2022-11-23T01:16:53.7013872Z aesni_intel 188416 0 
2022-11-23T01:16:53.7014167Z libata 266240 1 ata_piix 2022-11-23T01:16:53.7014444Z aes_x86_64 20480 1 aesni_intel 2022-11-23T01:16:53.7014753Z crypto_simd 16384 1 aesni_intel 2022-11-23T01:16:53.7015046Z mousedev 24576 0 2022-11-23T01:16:53.7015323Z glue_helper 16384 1 aesni_intel 2022-11-23T01:16:53.7015629Z pcc_cpufreq 16384 0 2022-11-23T01:16:53.7015987Z cryptd 28672 3 crypto_simd,ghash_clmulni_intel,aesni_intel 2022-11-23T01:16:53.7016342Z scsi_mod 245760 1 libata 2022-11-23T01:16:53.7016601Z psmouse 32768 0 2022-11-23T01:16:53.7016880Z evdev 20480 3 2022-11-23T01:16:53.7017166Z button 16384 0 2022-11-23T01:16:53.7017408Z ena 114688 0 2022-11-23T01:16:53.7017680Z xen_blkfront 49152 2 2022-11-23T01:16:53.7017961Z crc32c_intel 24576 0 2022-11-23T01:16:53.7018214Z autofs4 49152 2 2022-11-23T01:16:53.7018484Z + modinfo nvidia 2022-11-23T01:16:53.7018792Z modinfo: ERROR: Module nvidia not found. 2022-11-23T01:16:53.7019054Z + true 2022-11-23T01:16:53.7019306Z + HAS_NVIDIA_DRIVER=0 2022-11-23T01:16:53.7019726Z ++ command -v nvidia-smi 2022-11-23T01:16:53.7020641Z + '[' -x '' ']' 2022-11-23T01:16:53.7021228Z + '[' 0 -eq 0 ']' 2022-11-23T01:16:53.7021605Z + sudo yum groupinstall -y 'Development Tools' 2022-11-23T01:16:54.1753407Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:16:54.4780469Z Resolving Dependencies 2022-11-23T01:16:54.4785600Z --> Running transaction check 2022-11-23T01:16:54.4788596Z ---> Package autoconf.noarch 0:2.69-11.amzn2 will be installed 2022-11-23T01:16:54.5002014Z --> Processing Dependency: m4 >= 1.4.14 for package: autoconf-2.69-11.amzn2.noarch 2022-11-23T01:16:54.7073776Z --> Processing Dependency: perl(Data::Dumper) for package: autoconf-2.69-11.amzn2.noarch 2022-11-23T01:16:54.7074804Z ---> Package automake.noarch 0:1.13.4-3.1.amzn2 will be installed 2022-11-23T01:16:54.7120172Z --> Processing Dependency: perl(Thread::Queue) for package: automake-1.13.4-3.1.amzn2.noarch 2022-11-23T01:16:54.7126878Z --> 
Processing Dependency: perl(TAP::Parser) for package: automake-1.13.4-3.1.amzn2.noarch 2022-11-23T01:16:54.7137784Z ---> Package bison.x86_64 0:3.0.4-6.amzn2.0.2 will be installed 2022-11-23T01:16:54.7252875Z ---> Package byacc.x86_64 0:1.9.20130304-3.amzn2.0.2 will be installed 2022-11-23T01:16:54.7259904Z ---> Package cscope.x86_64 0:15.8-10.amzn2.0.2 will be installed 2022-11-23T01:16:54.7303786Z --> Processing Dependency: emacs-filesystem for package: cscope-15.8-10.amzn2.0.2.x86_64 2022-11-23T01:16:54.7327543Z ---> Package ctags.x86_64 0:5.8-13.amzn2.0.2 will be installed 2022-11-23T01:16:54.7336242Z ---> Package diffstat.x86_64 0:1.57-4.amzn2.0.2 will be installed 2022-11-23T01:16:54.7343918Z ---> Package doxygen.x86_64 1:1.8.5-4.amzn2 will be installed 2022-11-23T01:16:54.7444423Z ---> Package elfutils.x86_64 0:0.176-2.amzn2 will be installed 2022-11-23T01:16:54.7577488Z ---> Package flex.x86_64 0:2.5.37-3.amzn2.0.3 will be installed 2022-11-23T01:16:54.7596363Z ---> Package gcc.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:54.7768102Z --> Processing Dependency: cpp = 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.7787446Z --> Processing Dependency: libsanitizer >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.7842027Z --> Processing Dependency: libquadmath >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.7895187Z --> Processing Dependency: libmpx >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.7950495Z --> Processing Dependency: libitm >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.8002747Z --> Processing Dependency: libcilkrts >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.8055423Z --> Processing Dependency: libatomic >= 7.3.1-15.amzn2 for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.8109096Z --> Processing Dependency: glibc-devel >= 2.2.90-12 for package: 
gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.8270776Z --> Processing Dependency: libmpfr.so.4()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.8291400Z --> Processing Dependency: libmpc.so.3()(64bit) for package: gcc-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.8313066Z ---> Package gcc-c++.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:54.8339425Z ---> Package gcc-gfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:54.8373137Z --> Processing Dependency: libgfortran.so.4()(64bit) for package: gcc-gfortran-7.3.1-15.amzn2.x86_64 2022-11-23T01:16:54.8434641Z ---> Package indent.x86_64 0:2.2.11-13.amzn2.0.2 will be installed 2022-11-23T01:16:54.8448695Z ---> Package intltool.noarch 0:0.50.2-7.amzn2 will be installed 2022-11-23T01:16:54.8498887Z --> Processing Dependency: perl(XML::Parser) for package: intltool-0.50.2-7.amzn2.noarch 2022-11-23T01:16:54.8514204Z --> Processing Dependency: gettext-devel for package: intltool-0.50.2-7.amzn2.noarch 2022-11-23T01:16:54.8532376Z ---> Package libtool.x86_64 0:2.4.2-22.2.amzn2.0.2 will be installed 2022-11-23T01:16:54.8561452Z ---> Package patch.x86_64 0:2.7.1-12.amzn2.0.2 will be installed 2022-11-23T01:16:54.8597984Z ---> Package patchutils.x86_64 0:0.3.3-4.amzn2.0.1 will be installed 2022-11-23T01:16:54.8622371Z ---> Package rcs.x86_64 0:5.9.0-5.amzn2.0.2 will be installed 2022-11-23T01:16:54.8654455Z ---> Package rpm-build.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-11-23T01:16:54.8891116Z --> Processing Dependency: /usr/bin/gdb-add-index for package: rpm-build-4.11.3-48.amzn2.0.2.x86_64 2022-11-23T01:16:54.8908326Z ---> Package rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2 will be installed 2022-11-23T01:16:54.8931424Z ---> Package subversion.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-11-23T01:16:54.9099912Z --> Processing Dependency: subversion-libs(x86-64) = 1.7.14-16.amzn2.0.1 for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9119136Z --> 
Processing Dependency: libsvn_wc-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9120184Z --> Processing Dependency: libsvn_subr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9120836Z --> Processing Dependency: libsvn_repos-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9121969Z --> Processing Dependency: libsvn_ra_svn-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9122647Z --> Processing Dependency: libsvn_ra_neon-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9123308Z --> Processing Dependency: libsvn_ra_local-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9123945Z --> Processing Dependency: libsvn_ra-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9124585Z --> Processing Dependency: libsvn_fs_util-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9125200Z --> Processing Dependency: libsvn_fs_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9125834Z --> Processing Dependency: libsvn_fs_base-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9126471Z --> Processing Dependency: libsvn_fs-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9127104Z --> Processing Dependency: libsvn_diff-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9127731Z --> Processing Dependency: libsvn_delta-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9128379Z --> Processing Dependency: libsvn_client-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9129001Z --> Processing Dependency: libneon.so.27()(64bit) for package: 
subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9147912Z --> Processing Dependency: libaprutil-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9168225Z --> Processing Dependency: libapr-1.so.0()(64bit) for package: subversion-1.7.14-16.amzn2.0.1.x86_64 2022-11-23T01:16:54.9192738Z ---> Package swig.x86_64 0:3.0.12-11.amzn2.0.3 will be installed 2022-11-23T01:16:54.9213155Z ---> Package system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14 will be installed 2022-11-23T01:16:54.9259335Z --> Processing Dependency: dwz >= 0.4 for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-11-23T01:16:54.9276551Z --> Processing Dependency: perl-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-11-23T01:16:54.9288652Z --> Processing Dependency: go-srpm-macros for package: system-rpm-config-9.1.0-76.amzn2.0.14.noarch 2022-11-23T01:16:54.9464151Z ---> Package systemtap.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-11-23T01:16:54.9477197Z --> Processing Dependency: systemtap-devel = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:16:54.9492264Z --> Processing Dependency: systemtap-client = 4.5-1.amzn2.0.1 for package: systemtap-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:16:54.9505723Z --> Running transaction check 2022-11-23T01:16:54.9509106Z ---> Package apr.x86_64 0:1.7.0-9.amzn2 will be installed 2022-11-23T01:16:54.9588652Z ---> Package apr-util.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-11-23T01:16:54.9626747Z --> Processing Dependency: apr-util-bdb(x86-64) = 1.6.1-5.amzn2.0.2 for package: apr-util-1.6.1-5.amzn2.0.2.x86_64 2022-11-23T01:16:54.9640771Z ---> Package cpp.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:54.9713634Z ---> Package dwz.x86_64 0:0.11-3.amzn2.0.3 will be installed 2022-11-23T01:16:54.9724056Z ---> Package emacs-filesystem.noarch 1:27.2-4.amzn2.0.1 will be installed 2022-11-23T01:16:54.9725078Z ---> Package gdb.x86_64 0:8.0.1-36.amzn2.0.1 
will be installed 2022-11-23T01:16:54.9794129Z ---> Package gettext-devel.x86_64 0:0.19.8.1-3.amzn2 will be installed 2022-11-23T01:16:54.9851529Z --> Processing Dependency: gettext-common-devel = 0.19.8.1-3.amzn2 for package: gettext-devel-0.19.8.1-3.amzn2.x86_64 2022-11-23T01:16:54.9860155Z ---> Package glibc-devel.x86_64 0:2.26-62.amzn2 will be installed 2022-11-23T01:16:54.9980751Z --> Processing Dependency: glibc-headers = 2.26-62.amzn2 for package: glibc-devel-2.26-62.amzn2.x86_64 2022-11-23T01:16:55.0008634Z --> Processing Dependency: glibc-headers for package: glibc-devel-2.26-62.amzn2.x86_64 2022-11-23T01:16:55.0009432Z ---> Package go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.2 will be installed 2022-11-23T01:16:55.0014397Z ---> Package libatomic.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:55.0027441Z ---> Package libcilkrts.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:55.0054063Z ---> Package libgfortran.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:55.0089025Z ---> Package libitm.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:55.0104561Z ---> Package libmpc.x86_64 0:1.0.1-3.amzn2.0.2 will be installed 2022-11-23T01:16:55.0117077Z ---> Package libmpx.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:55.0131664Z ---> Package libquadmath.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:55.0156697Z ---> Package libsanitizer.x86_64 0:7.3.1-15.amzn2 will be installed 2022-11-23T01:16:55.0201933Z ---> Package m4.x86_64 0:1.4.16-10.amzn2.0.2 will be installed 2022-11-23T01:16:55.0216561Z ---> Package mpfr.x86_64 0:3.1.1-4.amzn2.0.2 will be installed 2022-11-23T01:16:55.0237335Z ---> Package neon.x86_64 0:0.30.0-3.amzn2.0.2 will be installed 2022-11-23T01:16:55.0311349Z --> Processing Dependency: libgnutls.so.28(GNUTLS_2_12)(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:16:55.0349665Z --> Processing Dependency: libgnutls.so.28(GNUTLS_1_4)(64bit) for package: 
neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:16:55.0350994Z --> Processing Dependency: libproxy.so.1()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:16:55.0369793Z --> Processing Dependency: libpakchois.so.0()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:16:55.0386765Z --> Processing Dependency: libgnutls.so.28()(64bit) for package: neon-0.30.0-3.amzn2.0.2.x86_64 2022-11-23T01:16:55.0393340Z ---> Package perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2 will be installed 2022-11-23T01:16:55.0440265Z ---> Package perl-Test-Harness.noarch 0:3.28-3.amzn2 will be installed 2022-11-23T01:16:55.0532499Z ---> Package perl-Thread-Queue.noarch 0:3.02-2.amzn2 will be installed 2022-11-23T01:16:55.0544353Z ---> Package perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2 will be installed 2022-11-23T01:16:55.0560156Z ---> Package perl-srpm-macros.noarch 0:1-8.amzn2.0.1 will be installed 2022-11-23T01:16:55.0561274Z ---> Package subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1 will be installed 2022-11-23T01:16:55.0588726Z ---> Package systemtap-client.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-11-23T01:16:55.0792662Z --> Processing Dependency: mokutil for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:16:55.0805919Z --> Processing Dependency: libavahi-common.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:16:55.0832063Z --> Processing Dependency: libavahi-client.so.3()(64bit) for package: systemtap-client-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:16:55.0832812Z ---> Package systemtap-devel.x86_64 0:4.5-1.amzn2.0.1 will be installed 2022-11-23T01:16:55.0945364Z --> Processing Dependency: kernel-devel-uname-r for package: systemtap-devel-4.5-1.amzn2.0.1.x86_64 2022-11-23T01:16:55.1964006Z --> Running transaction check 2022-11-23T01:16:55.1964922Z ---> Package apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2 will be installed 2022-11-23T01:16:55.1974446Z ---> Package avahi-libs.x86_64 0:0.6.31-20.amzn2 will be installed 
2022-11-23T01:16:55.1999889Z ---> Package gettext-common-devel.noarch 0:0.19.8.1-3.amzn2 will be installed 2022-11-23T01:16:55.2000853Z ---> Package glibc-headers.x86_64 0:2.26-62.amzn2 will be installed 2022-11-23T01:16:55.2074475Z --> Processing Dependency: kernel-headers >= 2.2.1 for package: glibc-headers-2.26-62.amzn2.x86_64 2022-11-23T01:16:55.3168410Z --> Processing Dependency: kernel-headers for package: glibc-headers-2.26-62.amzn2.x86_64 2022-11-23T01:16:55.3169142Z ---> Package gnutls.x86_64 0:3.3.29-9.amzn2.0.1 will be installed 2022-11-23T01:16:55.3233899Z --> Processing Dependency: trousers >= 0.3.11.2 for package: gnutls-3.3.29-9.amzn2.0.1.x86_64 2022-11-23T01:16:55.3259719Z ---> Package kernel-devel.x86_64 0:4.14.296-222.539.amzn2 will be installed 2022-11-23T01:16:55.3285947Z --> Processing Dependency: elfutils-libelf-devel for package: kernel-devel-4.14.296-222.539.amzn2.x86_64 2022-11-23T01:16:55.3305199Z ---> Package libproxy.x86_64 0:0.4.11-10.amzn2.0.3 will be installed 2022-11-23T01:16:55.3333117Z --> Processing Dependency: libmodman.so.1()(64bit) for package: libproxy-0.4.11-10.amzn2.0.3.x86_64 2022-11-23T01:16:55.3350968Z ---> Package mokutil.x86_64 1:0.3.0-10.amzn2.0.1 will be installed 2022-11-23T01:16:55.3398595Z --> Processing Dependency: libefivar.so.1(libefivar.so.0)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-11-23T01:16:55.3418418Z --> Processing Dependency: libefivar.so.1(LIBEFIVAR_0.24)(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-11-23T01:16:55.3419629Z --> Processing Dependency: libefivar.so.1()(64bit) for package: 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 2022-11-23T01:16:55.3420328Z ---> Package pakchois.x86_64 0:0.4-10.amzn2.0.2 will be installed 2022-11-23T01:16:55.3433767Z --> Running transaction check 2022-11-23T01:16:55.3434302Z ---> Package efivar-libs.x86_64 0:31-4.amzn2.0.4 will be installed 2022-11-23T01:16:55.3451505Z ---> Package elfutils-libelf-devel.x86_64 0:0.176-2.amzn2 will be 
installed 2022-11-23T01:16:55.3463586Z --> Processing Dependency: pkgconfig(zlib) for package: elfutils-libelf-devel-0.176-2.amzn2.x86_64 2022-11-23T01:16:55.3491401Z ---> Package kernel-headers.x86_64 0:4.14.296-222.539.amzn2 will be installed 2022-11-23T01:16:55.3492328Z ---> Package libmodman.x86_64 0:2.0.1-8.amzn2.0.2 will be installed 2022-11-23T01:16:55.3509683Z ---> Package trousers.x86_64 0:0.3.14-2.amzn2.0.2 will be installed 2022-11-23T01:16:55.3565817Z --> Running transaction check 2022-11-23T01:16:55.3566404Z ---> Package zlib-devel.x86_64 0:1.2.7-19.amzn2.0.2 will be installed 2022-11-23T01:16:55.6148242Z --> Finished Dependency Resolution 2022-11-23T01:16:55.6890638Z 2022-11-23T01:16:55.6890825Z Dependencies Resolved 2022-11-23T01:16:55.7003122Z 2022-11-23T01:16:55.7003406Z ================================================================================ 2022-11-23T01:16:55.7003986Z Package Arch Version Repository Size 2022-11-23T01:16:55.7004352Z ================================================================================ 2022-11-23T01:16:55.7004697Z Installing for group install "Development Tools": 2022-11-23T01:16:55.7005356Z autoconf noarch 2.69-11.amzn2 amzn2-core 701 k 2022-11-23T01:16:55.7005797Z automake noarch 1.13.4-3.1.amzn2 amzn2-core 679 k 2022-11-23T01:16:55.7006249Z bison x86_64 3.0.4-6.amzn2.0.2 amzn2-core 674 k 2022-11-23T01:16:55.7006691Z byacc x86_64 1.9.20130304-3.amzn2.0.2 amzn2-core 66 k 2022-11-23T01:16:55.7007305Z cscope x86_64 15.8-10.amzn2.0.2 amzn2-core 204 k 2022-11-23T01:16:55.7007866Z ctags x86_64 5.8-13.amzn2.0.2 amzn2-core 157 k 2022-11-23T01:16:55.7008327Z diffstat x86_64 1.57-4.amzn2.0.2 amzn2-core 35 k 2022-11-23T01:16:55.7008773Z doxygen x86_64 1:1.8.5-4.amzn2 amzn2-core 3.5 M 2022-11-23T01:16:55.7009191Z elfutils x86_64 0.176-2.amzn2 amzn2-core 307 k 2022-11-23T01:16:55.7009617Z flex x86_64 2.5.37-3.amzn2.0.3 amzn2-core 291 k 2022-11-23T01:16:55.7010043Z gcc x86_64 7.3.1-15.amzn2 amzn2-core 22 M 
2022-11-23T01:16:55.7010477Z gcc-c++ x86_64 7.3.1-15.amzn2 amzn2-core 13 M 2022-11-23T01:16:55.7010902Z gcc-gfortran x86_64 7.3.1-15.amzn2 amzn2-core 11 M 2022-11-23T01:16:55.7011355Z indent x86_64 2.2.11-13.amzn2.0.2 amzn2-core 150 k 2022-11-23T01:16:55.7011790Z intltool noarch 0.50.2-7.amzn2 amzn2-core 59 k 2022-11-23T01:16:55.7012214Z libtool x86_64 2.4.2-22.2.amzn2.0.2 amzn2-core 588 k 2022-11-23T01:16:55.7012648Z patch x86_64 2.7.1-12.amzn2.0.2 amzn2-core 110 k 2022-11-23T01:16:55.7013086Z patchutils x86_64 0.3.3-4.amzn2.0.1 amzn2-core 104 k 2022-11-23T01:16:55.7013519Z rcs x86_64 5.9.0-5.amzn2.0.2 amzn2-core 231 k 2022-11-23T01:16:55.7013933Z rpm-build x86_64 4.11.3-48.amzn2.0.2 amzn2-core 150 k 2022-11-23T01:16:55.7014371Z rpm-sign x86_64 4.11.3-48.amzn2.0.2 amzn2-core 50 k 2022-11-23T01:16:55.7014810Z subversion x86_64 1.7.14-16.amzn2.0.1 amzn2-core 1.0 M 2022-11-23T01:16:55.7015227Z swig x86_64 3.0.12-11.amzn2.0.3 amzn2-core 1.4 M 2022-11-23T01:16:55.7015680Z system-rpm-config noarch 9.1.0-76.amzn2.0.14 amzn2-core 90 k 2022-11-23T01:16:55.7016142Z systemtap x86_64 4.5-1.amzn2.0.1 amzn2-core 12 k 2022-11-23T01:16:55.7016463Z Installing for dependencies: 2022-11-23T01:16:55.7016860Z apr x86_64 1.7.0-9.amzn2 amzn2-core 122 k 2022-11-23T01:16:55.7017287Z apr-util x86_64 1.6.1-5.amzn2.0.2 amzn2-core 99 k 2022-11-23T01:16:55.7017738Z apr-util-bdb x86_64 1.6.1-5.amzn2.0.2 amzn2-core 19 k 2022-11-23T01:16:55.7018165Z avahi-libs x86_64 0.6.31-20.amzn2 amzn2-core 61 k 2022-11-23T01:16:55.7018594Z cpp x86_64 7.3.1-15.amzn2 amzn2-core 9.2 M 2022-11-23T01:16:55.7019017Z dwz x86_64 0.11-3.amzn2.0.3 amzn2-core 98 k 2022-11-23T01:16:55.7019432Z efivar-libs x86_64 31-4.amzn2.0.4 amzn2-core 68 k 2022-11-23T01:16:55.7019901Z elfutils-libelf-devel x86_64 0.176-2.amzn2 amzn2-core 40 k 2022-11-23T01:16:55.7020382Z emacs-filesystem noarch 1:27.2-4.amzn2.0.1 amzn2-core 67 k 2022-11-23T01:16:55.7020827Z gdb x86_64 8.0.1-36.amzn2.0.1 amzn2-core 3.1 M 2022-11-23T01:16:55.7021268Z 
gettext-common-devel noarch 0.19.8.1-3.amzn2 amzn2-core 410 k 2022-11-23T01:16:55.7021738Z gettext-devel x86_64 0.19.8.1-3.amzn2 amzn2-core 320 k 2022-11-23T01:16:55.7022189Z glibc-devel x86_64 2.26-62.amzn2 amzn2-core 995 k 2022-11-23T01:16:55.7022620Z glibc-headers x86_64 2.26-62.amzn2 amzn2-core 516 k 2022-11-23T01:16:55.7023063Z gnutls x86_64 3.3.29-9.amzn2.0.1 amzn2-core 661 k 2022-11-23T01:16:55.7023510Z go-srpm-macros noarch 3.0.15-23.amzn2.0.2 amzn2-core 23 k 2022-11-23T01:16:55.7024044Z kernel-devel x86_64 4.14.296-222.539.amzn2 amzn2-core 13 M 2022-11-23T01:16:55.7024535Z kernel-headers x86_64 4.14.296-222.539.amzn2 amzn2-core 1.2 M 2022-11-23T01:16:55.7024993Z libatomic x86_64 7.3.1-15.amzn2 amzn2-core 46 k 2022-11-23T01:16:55.7025432Z libcilkrts x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-11-23T01:16:55.7025853Z libgfortran x86_64 7.3.1-15.amzn2 amzn2-core 536 k 2022-11-23T01:16:55.7026288Z libitm x86_64 7.3.1-15.amzn2 amzn2-core 85 k 2022-11-23T01:16:55.7026745Z libmodman x86_64 2.0.1-8.amzn2.0.2 amzn2-core 29 k 2022-11-23T01:16:55.7027180Z libmpc x86_64 1.0.1-3.amzn2.0.2 amzn2-core 52 k 2022-11-23T01:16:55.7027593Z libmpx x86_64 7.3.1-15.amzn2 amzn2-core 51 k 2022-11-23T01:16:55.7028028Z libproxy x86_64 0.4.11-10.amzn2.0.3 amzn2-core 61 k 2022-11-23T01:16:55.7028472Z libquadmath x86_64 7.3.1-15.amzn2 amzn2-core 189 k 2022-11-23T01:16:55.7028914Z libsanitizer x86_64 7.3.1-15.amzn2 amzn2-core 642 k 2022-11-23T01:16:55.7029878Z m4 x86_64 1.4.16-10.amzn2.0.2 amzn2-core 256 k 2022-11-23T01:16:55.7030309Z mokutil x86_64 1:0.3.0-10.amzn2.0.1 amzn2-core 39 k 2022-11-23T01:16:55.7030739Z mpfr x86_64 3.1.1-4.amzn2.0.2 amzn2-core 208 k 2022-11-23T01:16:55.7031149Z neon x86_64 0.30.0-3.amzn2.0.2 amzn2-core 166 k 2022-11-23T01:16:55.7031579Z pakchois x86_64 0.4-10.amzn2.0.2 amzn2-core 14 k 2022-11-23T01:16:55.7032039Z perl-Data-Dumper x86_64 2.145-3.amzn2.0.2 amzn2-core 48 k 2022-11-23T01:16:55.7032521Z perl-Test-Harness noarch 3.28-3.amzn2 amzn2-core 302 k 
2022-11-23T01:16:55.7032983Z perl-Thread-Queue noarch 3.02-2.amzn2 amzn2-core 17 k 2022-11-23T01:16:55.7033462Z perl-XML-Parser x86_64 2.41-10.amzn2.0.2 amzn2-core 223 k 2022-11-23T01:16:55.7033935Z perl-srpm-macros noarch 1-8.amzn2.0.1 amzn2-core 4.7 k 2022-11-23T01:16:55.7034389Z subversion-libs x86_64 1.7.14-16.amzn2.0.1 amzn2-core 912 k 2022-11-23T01:16:55.7034848Z systemtap-client x86_64 4.5-1.amzn2.0.1 amzn2-core 3.7 M 2022-11-23T01:16:55.7035309Z systemtap-devel x86_64 4.5-1.amzn2.0.1 amzn2-core 2.3 M 2022-11-23T01:16:55.7035754Z trousers x86_64 0.3.14-2.amzn2.0.2 amzn2-core 294 k 2022-11-23T01:16:55.7036172Z zlib-devel x86_64 1.2.7-19.amzn2.0.2 amzn2-core 50 k 2022-11-23T01:16:55.7036384Z 2022-11-23T01:16:55.7036500Z Transaction Summary 2022-11-23T01:16:55.7036794Z ================================================================================ 2022-11-23T01:16:55.7037101Z Install 25 Packages (+43 Dependent packages) 2022-11-23T01:16:55.7037303Z 2022-11-23T01:16:55.7037425Z Total download size: 96 M 2022-11-23T01:16:55.7037689Z Installed size: 303 M 2022-11-23T01:16:55.7037949Z Downloading packages: 2022-11-23T01:16:55.7058608Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed. 
2022-11-23T01:16:56.9486586Z -------------------------------------------------------------------------------- 2022-11-23T01:16:56.9487395Z Total 77 MB/s | 96 MB 00:01 2022-11-23T01:16:57.0540992Z Running transaction check 2022-11-23T01:16:57.1300302Z Running transaction test 2022-11-23T01:16:59.4979978Z Transaction test succeeded 2022-11-23T01:16:59.4984163Z Running transaction 2022-11-23T01:17:04.8369841Z Installing : mpfr-3.1.1-4.amzn2.0.2.x86_64 1/68 2022-11-23T01:17:07.3283176Z Installing : libmpc-1.0.1-3.amzn2.0.2.x86_64 2/68 2022-11-23T01:17:09.7643327Z Installing : m4-1.4.16-10.amzn2.0.2.x86_64 3/68 2022-11-23T01:17:12.2542808Z Installing : apr-1.7.0-9.amzn2.x86_64 4/68 2022-11-23T01:17:14.6742799Z Installing : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 5/68 2022-11-23T01:17:16.6869579Z Installing : apr-util-1.6.1-5.amzn2.0.2.x86_64 6/68 2022-11-23T01:17:16.7374169Z Installing : avahi-libs-0.6.31-20.amzn2.x86_64 7/68 2022-11-23T01:17:16.7820197Z Installing : libquadmath-7.3.1-15.amzn2.x86_64 8/68 2022-11-23T01:17:16.8081165Z Installing : patch-2.7.1-12.amzn2.0.2.x86_64 9/68 2022-11-23T01:17:16.8948407Z Installing : perl-Thread-Queue-3.02-2.amzn2.noarch 10/68 2022-11-23T01:17:17.9555975Z Installing : libgfortran-7.3.1-15.amzn2.x86_64 11/68 2022-11-23T01:17:17.9920974Z Installing : cpp-7.3.1-15.amzn2.x86_64 12/68 2022-11-23T01:17:18.0253905Z Installing : libmodman-2.0.1-8.amzn2.0.2.x86_64 13/68 2022-11-23T01:17:18.0855625Z Installing : libproxy-0.4.11-10.amzn2.0.3.x86_64 14/68 2022-11-23T01:17:18.1529454Z Installing : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 15/68 2022-11-23T01:17:18.2917767Z Installing : elfutils-0.176-2.amzn2.x86_64 16/68 2022-11-23T01:17:18.3251387Z Installing : libsanitizer-7.3.1-15.amzn2.x86_64 17/68 2022-11-23T01:17:18.3524112Z Installing : 1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 18/68 2022-11-23T01:17:18.3871907Z Installing : efivar-libs-31-4.amzn2.0.4.x86_64 19/68 2022-11-23T01:17:18.4158797Z Installing : 
1:mokutil-0.3.0-10.amzn2.0.1.x86_64 20/68 2022-11-23T01:17:18.5099359Z Installing : gettext-common-devel-0.19.8.1-3.amzn2.noarch 21/68 2022-11-23T01:17:18.5610466Z Installing : gettext-devel-0.19.8.1-3.amzn2.x86_64 22/68 2022-11-23T01:17:18.6507745Z Installing : dwz-0.11-3.amzn2.0.3.x86_64 23/68 2022-11-23T01:17:18.8306892Z Installing : trousers-0.3.14-2.amzn2.0.2.x86_64 24/68 2022-11-23T01:17:18.8719339Z Installing : gnutls-3.3.29-9.amzn2.0.1.x86_64 25/68 2022-11-23T01:17:19.2666841Z Installing : libitm-7.3.1-15.amzn2.x86_64 26/68 2022-11-23T01:17:19.2986448Z Installing : gdb-8.0.1-36.amzn2.0.1.x86_64 27/68 2022-11-23T01:17:19.3312652Z Installing : libmpx-7.3.1-15.amzn2.x86_64 28/68 2022-11-23T01:17:19.3508850Z Installing : perl-srpm-macros-1-8.amzn2.0.1.noarch 29/68 2022-11-23T01:17:19.3824602Z Installing : go-srpm-macros-3.0.15-23.amzn2.0.2.noarch 30/68 2022-11-23T01:17:19.4118020Z Installing : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 31/68 2022-11-23T01:17:19.5040378Z Installing : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 32/68 2022-11-23T01:17:19.6087153Z Installing : autoconf-2.69-11.amzn2.noarch 33/68 2022-11-23T01:17:19.7225080Z Installing : perl-Test-Harness-3.28-3.amzn2.noarch 34/68 2022-11-23T01:17:19.7650004Z Installing : automake-1.13.4-3.1.amzn2.noarch 35/68 2022-11-23T01:17:19.7903094Z Installing : zlib-devel-1.2.7-19.amzn2.0.2.x86_64 36/68 2022-11-23T01:17:19.8114879Z Installing : elfutils-libelf-devel-0.176-2.amzn2.x86_64 37/68 2022-11-23T01:17:20.1029447Z Installing : libatomic-7.3.1-15.amzn2.x86_64 38/68 2022-11-23T01:17:20.2716947Z Installing : kernel-headers-4.14.296-222.539.amzn2.x86_64 39/68 2022-11-23T01:17:20.4094258Z Installing : glibc-headers-2.26-62.amzn2.x86_64 40/68 2022-11-23T01:17:20.4457613Z Installing : glibc-devel-2.26-62.amzn2.x86_64 41/68 2022-11-23T01:17:22.5936986Z Installing : libcilkrts-7.3.1-15.amzn2.x86_64 42/68 2022-11-23T01:17:26.4741332Z Installing : gcc-7.3.1-15.amzn2.x86_64 43/68 2022-11-23T01:17:37.8230341Z 
Installing : kernel-devel-4.14.296-222.539.amzn2.x86_64 44/68 2022-11-23T01:17:38.4349592Z Installing : systemtap-devel-4.5-1.amzn2.0.1.x86_64 45/68 2022-11-23T01:17:38.4923436Z Installing : systemtap-client-4.5-1.amzn2.0.1.x86_64 46/68 2022-11-23T01:17:38.5479837Z Installing : pakchois-0.4-10.amzn2.0.2.x86_64 47/68 2022-11-23T01:17:38.6816706Z Installing : neon-0.30.0-3.amzn2.0.2.x86_64 48/68 2022-11-23T01:17:38.8620073Z Installing : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 49/68 2022-11-23T01:17:38.9671025Z Installing : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68 2022-11-23T01:17:40.1816962Z Installing : systemtap-4.5-1.amzn2.0.1.x86_64 51/68 2022-11-23T01:17:41.8205567Z Installing : gcc-gfortran-7.3.1-15.amzn2.x86_64 52/68 2022-11-23T01:17:41.9395650Z Installing : gcc-c++-7.3.1-15.amzn2.x86_64 53/68 2022-11-23T01:17:41.9794613Z Installing : libtool-2.4.2-22.2.amzn2.0.2.x86_64 54/68 2022-11-23T01:17:42.0186621Z Installing : intltool-0.50.2-7.amzn2.noarch 55/68 2022-11-23T01:17:42.0773975Z Installing : rpm-build-4.11.3-48.amzn2.0.2.x86_64 56/68 2022-11-23T01:17:42.1400951Z Installing : cscope-15.8-10.amzn2.0.2.x86_64 57/68 2022-11-23T01:17:42.2442732Z Installing : flex-2.5.37-3.amzn2.0.3.x86_64 58/68 2022-11-23T01:17:42.3073996Z Installing : bison-3.0.4-6.amzn2.0.2.x86_64 59/68 2022-11-23T01:17:42.3530967Z Installing : rcs-5.9.0-5.amzn2.0.2.x86_64 60/68 2022-11-23T01:17:42.3921012Z Installing : ctags-5.8-13.amzn2.0.2.x86_64 61/68 2022-11-23T01:17:42.4375951Z Installing : indent-2.2.11-13.amzn2.0.2.x86_64 62/68 2022-11-23T01:17:43.1161216Z Installing : patchutils-0.3.3-4.amzn2.0.1.x86_64 63/68 2022-11-23T01:17:43.1627216Z Installing : 1:doxygen-1.8.5-4.amzn2.x86_64 64/68 2022-11-23T01:17:43.1928112Z Installing : diffstat-1.57-4.amzn2.0.2.x86_64 65/68 2022-11-23T01:17:43.5174610Z Installing : byacc-1.9.20130304-3.amzn2.0.2.x86_64 66/68 2022-11-23T01:17:43.5578322Z Installing : swig-3.0.12-11.amzn2.0.3.x86_64 67/68 2022-11-23T01:17:43.6254361Z Installing : 
rpm-sign-4.11.3-48.amzn2.0.2.x86_64 68/68 2022-11-23T01:17:43.6387998Z Verifying : elfutils-libelf-devel-0.176-2.amzn2.x86_64 1/68 2022-11-23T01:17:43.6504690Z Verifying : perl-Thread-Queue-3.02-2.amzn2.noarch 2/68 2022-11-23T01:17:43.6622929Z Verifying : gettext-devel-0.19.8.1-3.amzn2.x86_64 3/68 2022-11-23T01:17:43.6732716Z Verifying : patch-2.7.1-12.amzn2.0.2.x86_64 4/68 2022-11-23T01:17:43.6894992Z Verifying : kernel-devel-4.14.296-222.539.amzn2.x86_64 5/68 2022-11-23T01:17:43.7011432Z Verifying : flex-2.5.37-3.amzn2.0.3.x86_64 6/68 2022-11-23T01:17:43.7099331Z Verifying : pakchois-0.4-10.amzn2.0.2.x86_64 7/68 2022-11-23T01:17:43.7213009Z Verifying : rpm-sign-4.11.3-48.amzn2.0.2.x86_64 8/68 2022-11-23T01:17:43.7315932Z Verifying : glibc-devel-2.26-62.amzn2.x86_64 9/68 2022-11-23T01:17:43.7423377Z Verifying : gcc-gfortran-7.3.1-15.amzn2.x86_64 10/68 2022-11-23T01:17:43.7549703Z Verifying : swig-3.0.12-11.amzn2.0.3.x86_64 11/68 2022-11-23T01:17:43.7658500Z Verifying : byacc-1.9.20130304-3.amzn2.0.2.x86_64 12/68 2022-11-23T01:17:43.7778950Z Verifying : libmpc-1.0.1-3.amzn2.0.2.x86_64 13/68 2022-11-23T01:17:43.7898289Z Verifying : libcilkrts-7.3.1-15.amzn2.x86_64 14/68 2022-11-23T01:17:43.8022751Z Verifying : kernel-headers-4.14.296-222.539.amzn2.x86_64 15/68 2022-11-23T01:17:43.8130907Z Verifying : libproxy-0.4.11-10.amzn2.0.3.x86_64 16/68 2022-11-23T01:17:43.8232446Z Verifying : cscope-15.8-10.amzn2.0.2.x86_64 17/68 2022-11-23T01:17:43.8346172Z Verifying : diffstat-1.57-4.amzn2.0.2.x86_64 18/68 2022-11-23T01:17:43.8460137Z Verifying : 1:doxygen-1.8.5-4.amzn2.x86_64 19/68 2022-11-23T01:17:43.8565554Z Verifying : gcc-c++-7.3.1-15.amzn2.x86_64 20/68 2022-11-23T01:17:43.8672281Z Verifying : libatomic-7.3.1-15.amzn2.x86_64 21/68 2022-11-23T01:17:43.8779657Z Verifying : system-rpm-config-9.1.0-76.amzn2.0.14.noarch 22/68 2022-11-23T01:17:43.8904572Z Verifying : systemtap-devel-4.5-1.amzn2.0.1.x86_64 23/68 2022-11-23T01:17:43.9036308Z Verifying : 
zlib-devel-1.2.7-19.amzn2.0.2.x86_64 24/68 2022-11-23T01:17:43.9143777Z Verifying : glibc-headers-2.26-62.amzn2.x86_64 25/68 2022-11-23T01:17:43.9250626Z Verifying : perl-Test-Harness-3.28-3.amzn2.noarch 26/68 2022-11-23T01:17:43.9344235Z Verifying : autoconf-2.69-11.amzn2.noarch 27/68 2022-11-23T01:17:43.9459582Z Verifying : libquadmath-7.3.1-15.amzn2.x86_64 28/68 2022-11-23T01:17:43.9582290Z Verifying : intltool-0.50.2-7.amzn2.noarch 29/68 2022-11-23T01:17:43.9691957Z Verifying : apr-util-1.6.1-5.amzn2.0.2.x86_64 30/68 2022-11-23T01:17:43.9786853Z Verifying : cpp-7.3.1-15.amzn2.x86_64 31/68 2022-11-23T01:17:43.9894738Z Verifying : rpm-build-4.11.3-48.amzn2.0.2.x86_64 32/68 2022-11-23T01:17:44.0013354Z Verifying : go-srpm-macros-3.0.15-23.amzn2.0.2.noarch 33/68 2022-11-23T01:17:44.0104879Z Verifying : perl-Data-Dumper-2.145-3.amzn2.0.2.x86_64 34/68 2022-11-23T01:17:44.0194713Z Verifying : perl-srpm-macros-1-8.amzn2.0.1.noarch 35/68 2022-11-23T01:17:44.0300576Z Verifying : gnutls-3.3.29-9.amzn2.0.1.x86_64 36/68 2022-11-23T01:17:44.0395494Z Verifying : subversion-libs-1.7.14-16.amzn2.0.1.x86_64 37/68 2022-11-23T01:17:44.0498094Z Verifying : automake-1.13.4-3.1.amzn2.noarch 38/68 2022-11-23T01:17:44.0595782Z Verifying : apr-util-bdb-1.6.1-5.amzn2.0.2.x86_64 39/68 2022-11-23T01:17:44.0701999Z Verifying : libmpx-7.3.1-15.amzn2.x86_64 40/68 2022-11-23T01:17:44.0801622Z Verifying : avahi-libs-0.6.31-20.amzn2.x86_64 41/68 2022-11-23T01:17:44.0898612Z Verifying : bison-3.0.4-6.amzn2.0.2.x86_64 42/68 2022-11-23T01:17:44.0992546Z Verifying : libgfortran-7.3.1-15.amzn2.x86_64 43/68 2022-11-23T01:17:44.1095817Z Verifying : gdb-8.0.1-36.amzn2.0.1.x86_64 44/68 2022-11-23T01:17:44.1193501Z Verifying : patchutils-0.3.3-4.amzn2.0.1.x86_64 45/68 2022-11-23T01:17:44.1310499Z Verifying : libitm-7.3.1-15.amzn2.x86_64 46/68 2022-11-23T01:17:44.1427482Z Verifying : libtool-2.4.2-22.2.amzn2.0.2.x86_64 47/68 2022-11-23T01:17:44.1534666Z Verifying : gcc-7.3.1-15.amzn2.x86_64 48/68 
2022-11-23T01:17:44.1635144Z Verifying : indent-2.2.11-13.amzn2.0.2.x86_64 49/68 2022-11-23T01:17:44.1736087Z Verifying : subversion-1.7.14-16.amzn2.0.1.x86_64 50/68 2022-11-23T01:17:44.1836548Z Verifying : apr-1.7.0-9.amzn2.x86_64 51/68 2022-11-23T01:17:44.1935810Z Verifying : ctags-5.8-13.amzn2.0.2.x86_64 52/68 2022-11-23T01:17:44.2028908Z Verifying : 1:mokutil-0.3.0-10.amzn2.0.1.x86_64 53/68 2022-11-23T01:17:44.2133322Z Verifying : mpfr-3.1.1-4.amzn2.0.2.x86_64 54/68 2022-11-23T01:17:44.2229054Z Verifying : trousers-0.3.14-2.amzn2.0.2.x86_64 55/68 2022-11-23T01:17:44.2328483Z Verifying : neon-0.30.0-3.amzn2.0.2.x86_64 56/68 2022-11-23T01:17:44.2422259Z Verifying : systemtap-4.5-1.amzn2.0.1.x86_64 57/68 2022-11-23T01:17:44.2537321Z Verifying : dwz-0.11-3.amzn2.0.3.x86_64 58/68 2022-11-23T01:17:44.2653342Z Verifying : gettext-common-devel-0.19.8.1-3.amzn2.noarch 59/68 2022-11-23T01:17:44.2753674Z Verifying : systemtap-client-4.5-1.amzn2.0.1.x86_64 60/68 2022-11-23T01:17:44.2856323Z Verifying : efivar-libs-31-4.amzn2.0.4.x86_64 61/68 2022-11-23T01:17:44.2948811Z Verifying : rcs-5.9.0-5.amzn2.0.2.x86_64 62/68 2022-11-23T01:17:44.3055854Z Verifying : 1:emacs-filesystem-27.2-4.amzn2.0.1.noarch 63/68 2022-11-23T01:17:44.3145059Z Verifying : libsanitizer-7.3.1-15.amzn2.x86_64 64/68 2022-11-23T01:17:44.3251854Z Verifying : elfutils-0.176-2.amzn2.x86_64 65/68 2022-11-23T01:17:44.3337932Z Verifying : m4-1.4.16-10.amzn2.0.2.x86_64 66/68 2022-11-23T01:17:44.3428324Z Verifying : perl-XML-Parser-2.41-10.amzn2.0.2.x86_64 67/68 2022-11-23T01:17:44.4170003Z Verifying : libmodman-2.0.1-8.amzn2.0.2.x86_64 68/68 2022-11-23T01:17:44.4173335Z 2022-11-23T01:17:44.4173775Z Installed: 2022-11-23T01:17:44.4174294Z autoconf.noarch 0:2.69-11.amzn2 2022-11-23T01:17:44.4174759Z automake.noarch 0:1.13.4-3.1.amzn2 2022-11-23T01:17:44.4178968Z bison.x86_64 0:3.0.4-6.amzn2.0.2 2022-11-23T01:17:44.4179774Z byacc.x86_64 0:1.9.20130304-3.amzn2.0.2 2022-11-23T01:17:44.4180426Z cscope.x86_64 
0:15.8-10.amzn2.0.2 2022-11-23T01:17:44.4180862Z ctags.x86_64 0:5.8-13.amzn2.0.2 2022-11-23T01:17:44.4181294Z diffstat.x86_64 0:1.57-4.amzn2.0.2 2022-11-23T01:17:44.4181725Z doxygen.x86_64 1:1.8.5-4.amzn2 2022-11-23T01:17:44.4182145Z elfutils.x86_64 0:0.176-2.amzn2 2022-11-23T01:17:44.4182560Z flex.x86_64 0:2.5.37-3.amzn2.0.3 2022-11-23T01:17:44.4182977Z gcc.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4183372Z gcc-c++.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4184151Z gcc-gfortran.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4184747Z indent.x86_64 0:2.2.11-13.amzn2.0.2 2022-11-23T01:17:44.4185775Z intltool.noarch 0:0.50.2-7.amzn2 2022-11-23T01:17:44.4186728Z libtool.x86_64 0:2.4.2-22.2.amzn2.0.2 2022-11-23T01:17:44.4187520Z patch.x86_64 0:2.7.1-12.amzn2.0.2 2022-11-23T01:17:44.4188046Z patchutils.x86_64 0:0.3.3-4.amzn2.0.1 2022-11-23T01:17:44.4188447Z rcs.x86_64 0:5.9.0-5.amzn2.0.2 2022-11-23T01:17:44.4188876Z rpm-build.x86_64 0:4.11.3-48.amzn2.0.2 2022-11-23T01:17:44.4190041Z rpm-sign.x86_64 0:4.11.3-48.amzn2.0.2 2022-11-23T01:17:44.4190663Z subversion.x86_64 0:1.7.14-16.amzn2.0.1 2022-11-23T01:17:44.4191569Z swig.x86_64 0:3.0.12-11.amzn2.0.3 2022-11-23T01:17:44.4192376Z system-rpm-config.noarch 0:9.1.0-76.amzn2.0.14 2022-11-23T01:17:44.4193250Z systemtap.x86_64 0:4.5-1.amzn2.0.1 2022-11-23T01:17:44.4193637Z 2022-11-23T01:17:44.4193819Z Dependency Installed: 2022-11-23T01:17:44.4194591Z apr.x86_64 0:1.7.0-9.amzn2 2022-11-23T01:17:44.4195491Z apr-util.x86_64 0:1.6.1-5.amzn2.0.2 2022-11-23T01:17:44.4196018Z apr-util-bdb.x86_64 0:1.6.1-5.amzn2.0.2 2022-11-23T01:17:44.4196441Z avahi-libs.x86_64 0:0.6.31-20.amzn2 2022-11-23T01:17:44.4196871Z cpp.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4197415Z dwz.x86_64 0:0.11-3.amzn2.0.3 2022-11-23T01:17:44.4198318Z efivar-libs.x86_64 0:31-4.amzn2.0.4 2022-11-23T01:17:44.4199254Z elfutils-libelf-devel.x86_64 0:0.176-2.amzn2 2022-11-23T01:17:44.4199737Z emacs-filesystem.noarch 1:27.2-4.amzn2.0.1 2022-11-23T01:17:44.4200180Z 
gdb.x86_64 0:8.0.1-36.amzn2.0.1 2022-11-23T01:17:44.4200609Z gettext-common-devel.noarch 0:0.19.8.1-3.amzn2 2022-11-23T01:17:44.4201071Z gettext-devel.x86_64 0:0.19.8.1-3.amzn2 2022-11-23T01:17:44.4201507Z glibc-devel.x86_64 0:2.26-62.amzn2 2022-11-23T01:17:44.4201921Z glibc-headers.x86_64 0:2.26-62.amzn2 2022-11-23T01:17:44.4202352Z gnutls.x86_64 0:3.3.29-9.amzn2.0.1 2022-11-23T01:17:44.4202820Z go-srpm-macros.noarch 0:3.0.15-23.amzn2.0.2 2022-11-23T01:17:44.4203282Z kernel-devel.x86_64 0:4.14.296-222.539.amzn2 2022-11-23T01:17:44.4203711Z kernel-headers.x86_64 0:4.14.296-222.539.amzn2 2022-11-23T01:17:44.4204145Z libatomic.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4204567Z libcilkrts.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4204974Z libgfortran.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4205392Z libitm.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4205806Z libmodman.x86_64 0:2.0.1-8.amzn2.0.2 2022-11-23T01:17:44.4206223Z libmpc.x86_64 0:1.0.1-3.amzn2.0.2 2022-11-23T01:17:44.4206619Z libmpx.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4207196Z libproxy.x86_64 0:0.4.11-10.amzn2.0.3 2022-11-23T01:17:44.4207693Z libquadmath.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4208122Z libsanitizer.x86_64 0:7.3.1-15.amzn2 2022-11-23T01:17:44.4208541Z m4.x86_64 0:1.4.16-10.amzn2.0.2 2022-11-23T01:17:44.4208950Z mokutil.x86_64 1:0.3.0-10.amzn2.0.1 2022-11-23T01:17:44.4209365Z mpfr.x86_64 0:3.1.1-4.amzn2.0.2 2022-11-23T01:17:44.4209755Z neon.x86_64 0:0.30.0-3.amzn2.0.2 2022-11-23T01:17:44.4210170Z pakchois.x86_64 0:0.4-10.amzn2.0.2 2022-11-23T01:17:44.4210646Z perl-Data-Dumper.x86_64 0:2.145-3.amzn2.0.2 2022-11-23T01:17:44.4211103Z perl-Test-Harness.noarch 0:3.28-3.amzn2 2022-11-23T01:17:44.4211582Z perl-Thread-Queue.noarch 0:3.02-2.amzn2 2022-11-23T01:17:44.4212056Z perl-XML-Parser.x86_64 0:2.41-10.amzn2.0.2 2022-11-23T01:17:44.4212522Z perl-srpm-macros.noarch 0:1-8.amzn2.0.1 2022-11-23T01:17:44.4212965Z subversion-libs.x86_64 0:1.7.14-16.amzn2.0.1 
2022-11-23T01:17:44.4213408Z systemtap-client.x86_64 0:4.5-1.amzn2.0.1 2022-11-23T01:17:44.4213860Z systemtap-devel.x86_64 0:4.5-1.amzn2.0.1 2022-11-23T01:17:44.4214276Z trousers.x86_64 0:0.3.14-2.amzn2.0.2 2022-11-23T01:17:44.4214698Z zlib-devel.x86_64 0:1.2.7-19.amzn2.0.2 2022-11-23T01:17:44.4214903Z 2022-11-23T01:17:44.4215012Z Complete! 2022-11-23T01:17:44.4568354Z ++ uname -r 2022-11-23T01:17:44.4577494Z + sudo yum install -y 'kernel-devel-uname-r == 4.14.252-195.483.amzn2.x86_64' 2022-11-23T01:17:44.9685540Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:17:44.9796992Z Existing lock /var/run/yum.pid: another copy is running as pid 39156. 2022-11-23T01:17:44.9797453Z Another app is currently holding the yum lock; waiting for it to exit... 2022-11-23T01:17:44.9805914Z The other application is: yum 2022-11-23T01:17:44.9806289Z Memory : 91 M RSS (309 MB VSZ) 2022-11-23T01:17:44.9807565Z Started: Wed Nov 23 01:17:43 2022 - 00:01 ago 2022-11-23T01:17:44.9807865Z State : Running, pid: 39156 2022-11-23T01:17:46.9833167Z Another app is currently holding the yum lock; waiting for it to exit... 
2022-11-23T01:17:46.9840442Z The other application is: yum 2022-11-23T01:17:46.9841077Z Memory : 167 M RSS (386 MB VSZ) 2022-11-23T01:17:46.9842001Z Started: Wed Nov 23 01:17:43 2022 - 00:03 ago 2022-11-23T01:17:46.9842375Z State : Running, pid: 39156 2022-11-23T01:17:49.2609491Z Resolving Dependencies 2022-11-23T01:17:49.2615886Z --> Running transaction check 2022-11-23T01:17:49.2616597Z ---> Package kernel-devel.x86_64 0:4.14.252-195.483.amzn2 will be installed 2022-11-23T01:17:49.5504332Z --> Finished Dependency Resolution 2022-11-23T01:17:49.6347699Z 2022-11-23T01:17:49.6348570Z Dependencies Resolved 2022-11-23T01:17:49.6354346Z 2022-11-23T01:17:49.6354748Z ================================================================================ 2022-11-23T01:17:49.6355353Z Package Arch Version Repository Size 2022-11-23T01:17:49.6355713Z ================================================================================ 2022-11-23T01:17:49.6356015Z Installing: 2022-11-23T01:17:49.6356633Z kernel-devel x86_64 4.14.252-195.483.amzn2 amzn2-core 13 M 2022-11-23T01:17:49.6357074Z 2022-11-23T01:17:49.6357285Z Transaction Summary 2022-11-23T01:17:49.6357860Z ================================================================================ 2022-11-23T01:17:49.6358338Z Install 1 Package 2022-11-23T01:17:49.6358619Z 2022-11-23T01:17:49.6358763Z Total download size: 13 M 2022-11-23T01:17:49.6359039Z Installed size: 60 M 2022-11-23T01:17:49.6359289Z Downloading packages: 2022-11-23T01:17:49.6367956Z Delta RPMs disabled because /usr/bin/applydeltarpm not installed. 
2022-11-23T01:17:49.9619672Z Running transaction check 2022-11-23T01:17:49.9802597Z Running transaction test 2022-11-23T01:17:50.3832636Z Transaction test succeeded 2022-11-23T01:17:50.3835582Z Running transaction 2022-11-23T01:18:05.9529659Z Installing : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1 2022-11-23T01:18:06.0371738Z Verifying : kernel-devel-4.14.252-195.483.amzn2.x86_64 1/1 2022-11-23T01:18:06.0372274Z 2022-11-23T01:18:06.0372468Z Installed: 2022-11-23T01:18:06.0373232Z kernel-devel.x86_64 0:4.14.252-195.483.amzn2 2022-11-23T01:18:06.0373642Z 2022-11-23T01:18:06.0373833Z Complete! 2022-11-23T01:18:06.0696341Z + sudo modprobe backlight 2022-11-23T01:18:06.0931254Z + sudo curl -fsL -o /tmp/nvidia_driver https://s3.amazonaws.com/ossci-linux/nvidia_driver/NVIDIA-Linux-x86_64-515.76.run 2022-11-23T01:18:09.7733023Z + set +e 2022-11-23T01:18:09.7733527Z + sudo /bin/bash /tmp/nvidia_driver -s --no-drm 2022-11-23T01:18:11.1514865Z Verifying archive integrity... OK 2022-11-23T01:18:37.9922100Z Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 
515.76................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ 2022-11-23T01:18:38.1342883Z 2022-11-23T01:18:38.1343637Z WARNING: The nvidia-drm module will not be installed. As a result, DRM-KMS will not function with this installation of the NVIDIA driver. 2022-11-23T01:18:38.1343998Z 2022-11-23T01:18:51.3167801Z 2022-11-23T01:18:51.3169029Z WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path '/usr/lib64/xorg/modules'; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org SDK/development package for your distribution and reinstall the driver. 
2022-11-23T01:18:51.3171384Z 2022-11-23T01:19:00.2585845Z + NVIDIA_INSTALLATION_STATUS=0 2022-11-23T01:19:00.2586227Z + RESET_GPU=0 2022-11-23T01:19:00.2586640Z + '[' 0 -ne 0 ']' 2022-11-23T01:19:00.2589054Z ++ command -v nvidia-smi 2022-11-23T01:19:00.2591838Z + '[' -x /usr/bin/nvidia-smi ']' 2022-11-23T01:19:00.2595718Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2022-11-23T01:19:05.3449366Z + INSTALLED_DRIVER_VERSION=515.76 2022-11-23T01:19:05.3449696Z + NVIDIA_SMI_STATUS=0 2022-11-23T01:19:05.3450088Z + '[' 0 -ne 0 ']' 2022-11-23T01:19:05.3450360Z + '[' 0 -eq 1 ']' 2022-11-23T01:19:05.3450669Z + sudo rm -fv /tmp/nvidia_driver 2022-11-23T01:19:05.4097063Z removed ‘/tmp/nvidia_driver’ 2022-11-23T01:19:05.4112238Z + set -e 2022-11-23T01:19:05.4112532Z + sudo modprobe nvidia 2022-11-23T01:19:05.4229550Z + echo 'After installing NVIDIA driver' 2022-11-23T01:19:05.4230160Z + lspci 2022-11-23T01:19:05.4230417Z After installing NVIDIA driver 2022-11-23T01:19:05.4426413Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02) 2022-11-23T01:19:05.4427172Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2022-11-23T01:19:05.4427576Z 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] 2022-11-23T01:19:05.4427972Z 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01) 2022-11-23T01:19:05.4428338Z 00:02.0 VGA compatible controller: Cirrus Logic GD 5446 2022-11-23T01:19:05.4429368Z 00:03.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2022-11-23T01:19:05.4429860Z 00:1d.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:19:05.4430291Z 00:1e.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1) 2022-11-23T01:19:05.4430713Z 00:1f.0 Unassigned class [ff80]: XenSource, Inc. 
Xen Platform Device (rev 01) 2022-11-23T01:19:05.4431022Z + lsmod 2022-11-23T01:19:05.4448050Z Module Size Used by 2022-11-23T01:19:05.4448417Z nvidia 40808448 0 2022-11-23T01:19:05.4448892Z drm 425984 1 nvidia 2022-11-23T01:19:05.4449181Z i2c_core 77824 2 nvidia,drm 2022-11-23T01:19:05.4449438Z backlight 16384 0 2022-11-23T01:19:05.4451613Z xt_conntrack 16384 1 2022-11-23T01:19:05.4451947Z ipt_MASQUERADE 16384 1 2022-11-23T01:19:05.4452265Z nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE 2022-11-23T01:19:05.4452580Z nf_conntrack_netlink 49152 0 2022-11-23T01:19:05.4452870Z nfnetlink 16384 2 nf_conntrack_netlink 2022-11-23T01:19:05.4453159Z xfrm_user 45056 1 2022-11-23T01:19:05.4453448Z xfrm_algo 16384 1 xfrm_user 2022-11-23T01:19:05.4453704Z xt_addrtype 16384 2 2022-11-23T01:19:05.4453969Z iptable_filter 16384 1 2022-11-23T01:19:05.4454236Z iptable_nat 16384 1 2022-11-23T01:19:05.4454502Z nf_conntrack_ipv4 16384 3 2022-11-23T01:19:05.4454798Z nf_defrag_ipv4 16384 1 nf_conntrack_ipv4 2022-11-23T01:19:05.4455112Z nf_nat_ipv4 16384 1 iptable_nat 2022-11-23T01:19:05.4455423Z nf_nat 36864 2 nf_nat_masquerade_ipv4,nf_nat_ipv4 2022-11-23T01:19:05.4455904Z nf_conntrack 155648 7 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink 2022-11-23T01:19:05.4456315Z br_netfilter 24576 0 2022-11-23T01:19:05.4456594Z bridge 172032 1 br_netfilter 2022-11-23T01:19:05.4456852Z stp 16384 1 bridge 2022-11-23T01:19:05.4457128Z llc 16384 2 bridge,stp 2022-11-23T01:19:05.4457393Z overlay 86016 0 2022-11-23T01:19:05.4457632Z sunrpc 393216 1 2022-11-23T01:19:05.4457885Z dm_mirror 28672 0 2022-11-23T01:19:05.4458164Z dm_region_hash 20480 1 dm_mirror 2022-11-23T01:19:05.4458462Z dm_log 20480 2 dm_region_hash,dm_mirror 2022-11-23T01:19:05.4458772Z dm_mod 143360 2 dm_log,dm_mirror 2022-11-23T01:19:05.4459050Z dax 69632 1 dm_mod 2022-11-23T01:19:05.4459294Z sb_edac 24576 0 2022-11-23T01:19:05.4459555Z crc32_pclmul 16384 0 
2022-11-23T01:19:05.4459865Z ghash_clmulni_intel 16384 0 2022-11-23T01:19:05.4460137Z pcbc 16384 0 2022-11-23T01:19:05.4460370Z ata_piix 36864 0 2022-11-23T01:19:05.4460625Z aesni_intel 188416 0 2022-11-23T01:19:05.4460889Z libata 266240 1 ata_piix 2022-11-23T01:19:05.4461148Z aes_x86_64 20480 1 aesni_intel 2022-11-23T01:19:05.4461435Z crypto_simd 16384 1 aesni_intel 2022-11-23T01:19:05.4461704Z mousedev 24576 0 2022-11-23T01:19:05.4461977Z glue_helper 16384 1 aesni_intel 2022-11-23T01:19:05.4462237Z pcc_cpufreq 16384 0 2022-11-23T01:19:05.4462748Z cryptd 28672 3 crypto_simd,ghash_clmulni_intel,aesni_intel 2022-11-23T01:19:05.4463076Z scsi_mod 245760 1 libata 2022-11-23T01:19:05.4463420Z psmouse 32768 0 2022-11-23T01:19:05.4463688Z evdev 20480 3 2022-11-23T01:19:05.4463934Z button 16384 0 2022-11-23T01:19:05.4464166Z ena 114688 0 2022-11-23T01:19:05.4464414Z xen_blkfront 49152 2 2022-11-23T01:19:05.4464671Z crc32c_intel 24576 0 2022-11-23T01:19:05.4464906Z autofs4 49152 2 2022-11-23T01:19:05.4465151Z + modinfo nvidia 2022-11-23T01:19:05.4465642Z filename: /lib/modules/4.14.252-195.483.amzn2.x86_64/kernel/drivers/video/nvidia.ko 2022-11-23T01:19:05.4465977Z firmware: nvidia/515.76/gsp.bin 2022-11-23T01:19:05.4466314Z alias: char-major-195-* 2022-11-23T01:19:05.4466583Z version: 515.76 2022-11-23T01:19:05.4466820Z supported: external 2022-11-23T01:19:05.4467080Z license: NVIDIA 2022-11-23T01:19:05.4467353Z srcversion: 51FD9DD90150B35351AFFBB 2022-11-23T01:19:05.4467647Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2022-11-23T01:19:05.4467966Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2022-11-23T01:19:05.4468277Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2022-11-23T01:19:05.4468612Z depends: i2c-core,drm 2022-11-23T01:19:05.4468866Z retpoline: Y 2022-11-23T01:19:05.4469648Z name: nvidia 2022-11-23T01:19:05.4470067Z vermagic: 4.14.252-195.483.amzn2.x86_64 SMP mod_unload modversions 2022-11-23T01:19:05.4470432Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 
2022-11-23T01:19:05.4470834Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2022-11-23T01:19:05.4471202Z parm: NVreg_ResmanDebugLevel:int 2022-11-23T01:19:05.4471484Z parm: NVreg_RmLogonRC:int 2022-11-23T01:19:05.4471786Z parm: NVreg_ModifyDeviceFiles:int 2022-11-23T01:19:05.4472092Z parm: NVreg_DeviceFileUID:int 2022-11-23T01:19:05.4472377Z parm: NVreg_DeviceFileGID:int 2022-11-23T01:19:05.4472677Z parm: NVreg_DeviceFileMode:int 2022-11-23T01:19:05.4473047Z parm: NVreg_InitializeSystemMemoryAllocations:int 2022-11-23T01:19:05.4473406Z parm: NVreg_UsePageAttributeTable:int 2022-11-23T01:19:05.4473731Z parm: NVreg_EnablePCIeGen3:int 2022-11-23T01:19:05.4474024Z parm: NVreg_EnableMSI:int 2022-11-23T01:19:05.4474315Z parm: NVreg_TCEBypassMode:int 2022-11-23T01:19:05.4474614Z parm: NVreg_EnableStreamMemOPs:int 2022-11-23T01:19:05.4474976Z parm: NVreg_RestrictProfilingToAdminUsers:int 2022-11-23T01:19:05.4475372Z parm: NVreg_PreserveVideoMemoryAllocations:int 2022-11-23T01:19:05.4475734Z parm: NVreg_EnableS0ixPowerManagement:int 2022-11-23T01:19:05.4476141Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2022-11-23T01:19:05.4476540Z parm: NVreg_DynamicPowerManagement:int 2022-11-23T01:19:05.4476942Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2022-11-23T01:19:05.4477347Z parm: NVreg_EnableGpuFirmware:int 2022-11-23T01:19:05.4477681Z parm: NVreg_EnableGpuFirmwareLogs:int 2022-11-23T01:19:05.4478029Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2022-11-23T01:19:05.4478399Z parm: NVreg_EnableUserNUMAManagement:int 2022-11-23T01:19:05.4478730Z parm: NVreg_MemoryPoolSize:int 2022-11-23T01:19:05.4479048Z parm: NVreg_KMallocHeapMaxSize:int 2022-11-23T01:19:05.4479356Z parm: NVreg_VMallocHeapMaxSize:int 2022-11-23T01:19:05.4479672Z parm: NVreg_IgnoreMMIOCheck:int 2022-11-23T01:19:05.4479975Z parm: NVreg_NvLinkDisable:int 2022-11-23T01:19:05.4480307Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2022-11-23T01:19:05.4480663Z parm: 
NVreg_RegisterPCIDriver:int 2022-11-23T01:19:05.4480992Z parm: NVreg_EnableDbgBreakpoint:int 2022-11-23T01:19:05.4481410Z parm: NVreg_RegistryDwords:charp 2022-11-23T01:19:05.4481748Z parm: NVreg_RegistryDwordsPerDevice:charp 2022-11-23T01:19:05.4482120Z parm: NVreg_RmMsg:charp 2022-11-23T01:19:05.4482406Z parm: NVreg_GpuBlacklist:charp 2022-11-23T01:19:05.4482726Z parm: NVreg_TemporaryFilePath:charp 2022-11-23T01:19:05.4483040Z parm: NVreg_ExcludedGpus:charp 2022-11-23T01:19:05.4483337Z parm: NVreg_DmaRemapPeerMmio:int 2022-11-23T01:19:05.4483644Z parm: rm_firmware_active:charp 2022-11-23T01:19:05.4483917Z + set +e 2022-11-23T01:19:05.4484182Z + nvidia-smi 2022-11-23T01:19:09.2139464Z Wed Nov 23 01:19:09 2022 2022-11-23T01:19:09.2140309Z +-----------------------------------------------------------------------------+ 2022-11-23T01:19:09.2140853Z | NVIDIA-SMI 515.76 Driver Version: 515.76 CUDA Version: 11.7 | 2022-11-23T01:19:09.2141338Z |-------------------------------+----------------------+----------------------+ 2022-11-23T01:19:09.2141863Z | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | 2022-11-23T01:19:09.2142374Z | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | 2022-11-23T01:19:09.2142738Z | | | MIG M. 
| 2022-11-23T01:19:09.2143033Z |===============================+======================+======================| 2022-11-23T01:19:09.2186504Z | 0 Tesla M60 Off | 00000000:00:1D.0 Off | 8589873528 | 2022-11-23T01:19:09.2186859Z | N/A 22C P0 38W / 150W | 0MiB / 7680MiB | 0% Default | 2022-11-23T01:19:09.2187177Z | | | N/A | 2022-11-23T01:19:09.2187639Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:19:09.2234080Z | 1 Tesla M60 Off | 00000000:00:1E.0 Off | 0 | 2022-11-23T01:19:09.2234447Z | N/A 30C P0 38W / 150W | 0MiB / 7680MiB | 100% Default | 2022-11-23T01:19:09.2235011Z | | | N/A | 2022-11-23T01:19:09.2235464Z +-------------------------------+----------------------+----------------------+ 2022-11-23T01:19:09.2235822Z 2022-11-23T01:19:09.2236305Z +-----------------------------------------------------------------------------+ 2022-11-23T01:19:09.2236677Z | Processes: | 2022-11-23T01:19:09.2237010Z | GPU GI CI PID Type Process name GPU Memory | 2022-11-23T01:19:09.2237347Z | ID ID Usage | 2022-11-23T01:19:09.2237643Z |=============================================================================| 2022-11-23T01:19:09.2240209Z | No running processes found | 2022-11-23T01:19:09.2240684Z +-----------------------------------------------------------------------------+ 2022-11-23T01:19:09.7518969Z + NVIDIA_SMI_STATUS=0 2022-11-23T01:19:09.7519352Z + '[' 0 -eq 0 ']' 2022-11-23T01:19:09.7519693Z + echo 'INFO: Ignoring allowed status 0' 2022-11-23T01:19:09.7519977Z + set -e 2022-11-23T01:19:09.7520233Z INFO: Ignoring allowed status 0 2022-11-23T01:19:09.7524486Z == Installing nvidia container toolkit for amzn2 == 2022-11-23T01:19:09.7528915Z + sudo yum install -y yum-utils 2022-11-23T01:19:10.2979859Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:19:10.5721604Z Package yum-utils-1.1.31-46.amzn2.0.1.noarch already installed and latest version 2022-11-23T01:19:10.5722307Z Nothing to do 
2022-11-23T01:19:10.5927951Z + sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-11-23T01:19:11.1206795Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:19:11.1503174Z adding repo from: https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo 2022-11-23T01:19:11.1504039Z grabbing file https://nvidia.github.io/nvidia-docker/amzn2/nvidia-docker.repo to /etc/yum.repos.d/nvidia-docker.repo 2022-11-23T01:19:11.1504610Z repo saved to /etc/yum.repos.d/nvidia-docker.repo 2022-11-23T01:19:11.1646796Z + sudo yum install -y nvidia-docker2 2022-11-23T01:19:11.6861058Z Loaded plugins: extras_suggestions, langpacks, priorities, update-motd 2022-11-23T01:19:11.7304982Z Retrieving key from https://nvidia.github.io/libnvidia-container/gpgkey 2022-11-23T01:19:11.7406870Z Importing GPG key 0xF796ECB0: 2022-11-23T01:19:11.7407269Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-11-23T01:19:11.7407659Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-11-23T01:19:11.7408144Z From : https://nvidia.github.io/libnvidia-container/gpgkey 2022-11-23T01:19:12.1263432Z Retrieving key from https://nvidia.github.io/nvidia-container-runtime/gpgkey 2022-11-23T01:19:12.1350157Z Importing GPG key 0xF796ECB0: 2022-11-23T01:19:12.1350649Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-11-23T01:19:12.1351262Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-11-23T01:19:12.1351774Z From : https://nvidia.github.io/nvidia-container-runtime/gpgkey 2022-11-23T01:19:12.3566112Z Retrieving key from https://nvidia.github.io/nvidia-docker/gpgkey 2022-11-23T01:19:12.3654959Z Importing GPG key 0xF796ECB0: 2022-11-23T01:19:12.3655520Z Userid : "NVIDIA CORPORATION (Open Source Projects) " 2022-11-23T01:19:12.3656050Z Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0 2022-11-23T01:19:12.3656504Z From : https://nvidia.github.io/nvidia-docker/gpgkey 
2022-11-23T01:19:14.0778303Z Resolving Dependencies 2022-11-23T01:19:14.0785080Z --> Running transaction check 2022-11-23T01:19:14.0785938Z ---> Package nvidia-docker2.noarch 0:2.11.0-1 will be installed 2022-11-23T01:19:14.0811029Z --> Processing Dependency: nvidia-container-toolkit >= 1.10.0-1 for package: nvidia-docker2-2.11.0-1.noarch 2022-11-23T01:19:14.1198147Z --> Running transaction check 2022-11-23T01:19:14.1198661Z ---> Package nvidia-container-toolkit.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:19:14.1341713Z --> Processing Dependency: nvidia-container-toolkit-base = 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64 2022-11-23T01:19:14.1352438Z --> Processing Dependency: libnvidia-container-tools < 2.0.0 for package: nvidia-container-toolkit-1.11.0-1.x86_64 2022-11-23T01:19:14.1481560Z --> Processing Dependency: libnvidia-container-tools >= 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64 2022-11-23T01:19:14.1482046Z --> Running transaction check 2022-11-23T01:19:14.1483016Z ---> Package libnvidia-container-tools.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:19:14.1493212Z --> Processing Dependency: libnvidia-container1(x86-64) >= 1.11.0-1 for package: libnvidia-container-tools-1.11.0-1.x86_64 2022-11-23T01:19:14.1520384Z --> Processing Dependency: libnvidia-container.so.1(NVC_1.0)(64bit) for package: libnvidia-container-tools-1.11.0-1.x86_64 2022-11-23T01:19:14.1521111Z --> Processing Dependency: libnvidia-container.so.1()(64bit) for package: libnvidia-container-tools-1.11.0-1.x86_64 2022-11-23T01:19:14.1521965Z ---> Package nvidia-container-toolkit-base.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:19:14.1524482Z --> Running transaction check 2022-11-23T01:19:14.1525164Z ---> Package libnvidia-container1.x86_64 0:1.11.0-1 will be installed 2022-11-23T01:19:14.4449109Z --> Finished Dependency Resolution 2022-11-23T01:19:14.5194450Z 2022-11-23T01:19:14.5194791Z Dependencies Resolved 2022-11-23T01:19:14.5207979Z 
2022-11-23T01:19:14.5208274Z ================================================================================ 2022-11-23T01:19:14.5208833Z Package Arch Version Repository Size 2022-11-23T01:19:14.5209729Z ================================================================================ 2022-11-23T01:19:14.5210109Z Installing: 2022-11-23T01:19:14.5211233Z nvidia-docker2 noarch 2.11.0-1 libnvidia-container 8.7 k 2022-11-23T01:19:14.5211989Z Installing for dependencies: 2022-11-23T01:19:14.5213004Z libnvidia-container-tools x86_64 1.11.0-1 libnvidia-container 49 k 2022-11-23T01:19:14.5213883Z libnvidia-container1 x86_64 1.11.0-1 libnvidia-container 1.0 M 2022-11-23T01:19:14.5214389Z nvidia-container-toolkit x86_64 1.11.0-1 libnvidia-container 780 k 2022-11-23T01:19:14.5214932Z nvidia-container-toolkit-base x86_64 1.11.0-1 libnvidia-container 2.5 M 2022-11-23T01:19:14.5215196Z 2022-11-23T01:19:14.5215313Z Transaction Summary 2022-11-23T01:19:14.5215606Z ================================================================================ 2022-11-23T01:19:14.5215906Z Install 1 Package (+4 Dependent packages) 2022-11-23T01:19:14.5216115Z 2022-11-23T01:19:14.5216250Z Total download size: 4.3 M 2022-11-23T01:19:14.5216517Z Installed size: 12 M 2022-11-23T01:19:14.5216766Z Downloading packages: 2022-11-23T01:19:14.6436685Z -------------------------------------------------------------------------------- 2022-11-23T01:19:14.6437161Z Total 35 MB/s | 4.3 MB 00:00 2022-11-23T01:19:14.6481424Z Running transaction check 2022-11-23T01:19:14.6651038Z Running transaction test 2022-11-23T01:19:14.6811817Z Transaction test succeeded 2022-11-23T01:19:14.6815126Z Running transaction 2022-11-23T01:19:15.1705933Z Installing : nvidia-container-toolkit-base-1.11.0-1.x86_64 1/5 2022-11-23T01:19:15.2070262Z Installing : libnvidia-container1-1.11.0-1.x86_64 2/5 2022-11-23T01:19:15.3161302Z Installing : libnvidia-container-tools-1.11.0-1.x86_64 3/5 2022-11-23T01:19:15.3405267Z Installing : 
nvidia-container-toolkit-1.11.0-1.x86_64 4/5 2022-11-23T01:19:15.3794978Z Installing : nvidia-docker2-2.11.0-1.noarch 5/5 2022-11-23T01:19:15.3910677Z Verifying : libnvidia-container1-1.11.0-1.x86_64 1/5 2022-11-23T01:19:15.4028164Z Verifying : nvidia-container-toolkit-base-1.11.0-1.x86_64 2/5 2022-11-23T01:19:15.4136185Z Verifying : nvidia-container-toolkit-1.11.0-1.x86_64 3/5 2022-11-23T01:19:15.4235156Z Verifying : libnvidia-container-tools-1.11.0-1.x86_64 4/5 2022-11-23T01:19:15.5062967Z Verifying : nvidia-docker2-2.11.0-1.noarch 5/5 2022-11-23T01:19:15.5063253Z 2022-11-23T01:19:15.5063364Z Installed: 2022-11-23T01:19:15.5063748Z nvidia-docker2.noarch 0:2.11.0-1 2022-11-23T01:19:15.5063973Z 2022-11-23T01:19:15.5064100Z Dependency Installed: 2022-11-23T01:19:15.5064532Z libnvidia-container-tools.x86_64 0:1.11.0-1 2022-11-23T01:19:15.5065036Z libnvidia-container1.x86_64 0:1.11.0-1 2022-11-23T01:19:15.5065500Z nvidia-container-toolkit.x86_64 0:1.11.0-1 2022-11-23T01:19:15.5065984Z nvidia-container-toolkit-base.x86_64 0:1.11.0-1 2022-11-23T01:19:15.5066224Z 2022-11-23T01:19:15.5066325Z Complete! 2022-11-23T01:19:15.6062980Z + sudo systemctl restart docker 2022-11-23T01:19:24.0446783Z Command completed after 1 attempt(s). 2022-11-23T01:19:24.0447067Z 2022-11-23T01:19:24.0450025Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:19:24.0508047Z ##[group]Run python3 -m pip install psutil==5.9.1 2022-11-23T01:19:24.0508493Z python3 -m pip install psutil==5.9.1 2022-11-23T01:19:24.0509508Z python3 -m pip install pynvml==11.4.1 2022-11-23T01:19:24.0509956Z python3 -m tools.stats.monitor > usage_log.txt 2>&1 & 2022-11-23T01:19:24.0510366Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2022-11-23T01:19:24.0525154Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:19:24.0525477Z env: 2022-11-23T01:19:24.0525741Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:19:24.0526005Z GPU_FLAG: --gpus all 2022-11-23T01:19:24.0526275Z ##[endgroup] 2022-11-23T01:19:24.9286432Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T01:19:25.3212231Z Collecting psutil==5.9.1 2022-11-23T01:19:25.3456964Z Downloading psutil-5.9.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (281 kB) 2022-11-23T01:19:25.4133644Z Installing collected packages: psutil 2022-11-23T01:19:25.5695708Z Successfully installed psutil-5.9.1 2022-11-23T01:19:26.0399596Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T01:19:26.1106096Z Collecting pynvml==11.4.1 2022-11-23T01:19:26.1291625Z Downloading pynvml-11.4.1-py3-none-any.whl (46 kB) 2022-11-23T01:19:26.1789911Z Installing collected packages: pynvml 2022-11-23T01:19:26.2323459Z Successfully installed pynvml-11.4.1 2022-11-23T01:19:26.2832126Z Prepare all required actions 2022-11-23T01:19:26.2832500Z Getting action download info 2022-11-23T01:19:26.4298392Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:4a8bfae15cc25cc0785c1603ee87a9da8fd442ea) 2022-11-23T01:19:26.6258460Z Download action repository 'actions/download-artifact@v3' (SHA:9782bd6a9848b53b110e712e20e42d89988822b7) 
2022-11-23T01:19:26.7697759Z ##[group]Run ./.github/actions/download-build-artifacts 2022-11-23T01:19:26.7698041Z with: 2022-11-23T01:19:26.7698315Z name: linux-bionic-cuda11.6-py3.10-gcc7 2022-11-23T01:19:26.7698571Z env: 2022-11-23T01:19:26.7698798Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:19:26.7699067Z GPU_FLAG: --gpus all 2022-11-23T01:19:26.7699304Z ##[endgroup] 2022-11-23T01:19:26.8413216Z ##[group]Run seemethere/download-artifact-s3@v4 2022-11-23T01:19:26.8413511Z with: 2022-11-23T01:19:26.8413771Z name: linux-bionic-cuda11.6-py3.10-gcc7 2022-11-23T01:19:26.8414074Z s3-bucket: gha-artifacts 2022-11-23T01:19:26.8414380Z region: us-east-1 2022-11-23T01:19:26.8414616Z env: 2022-11-23T01:19:26.8414846Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:19:26.8415094Z GPU_FLAG: --gpus all 2022-11-23T01:19:26.8415344Z ##[endgroup] 2022-11-23T01:19:27.3893106Z Found 1 objects with prefix pytorch/pytorch/3528293562/linux-bionic-cuda11.6-py3.10-gcc7/ 2022-11-23T01:19:27.3893730Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-11-23T01:19:37.1923145Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2022-11-23T01:19:37.1923673Z 2022-11-23T01:19:37.1928555Z ##[warning]The `set-output` command is deprecated and will be disabled soon. Please upgrade to using Environment Files. 
For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ 2022-11-23T01:19:37.1929878Z Artifact download has finished successfully 2022-11-23T01:19:37.2104051Z ##[group]Run unzip -o artifacts.zip 2022-11-23T01:19:37.2104387Z unzip -o artifacts.zip 2022-11-23T01:19:37.2118664Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:19:37.2118969Z env: 2022-11-23T01:19:37.2119216Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:19:37.2119469Z GPU_FLAG: --gpus all 2022-11-23T01:19:37.2119717Z ##[endgroup] 2022-11-23T01:19:37.2206661Z Archive: artifacts.zip 2022-11-23T01:19:37.2208608Z creating: dist/ 2022-11-23T01:19:39.3164517Z inflating: dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:19:39.3164972Z creating: build/custom_test_artifacts/ 2022-11-23T01:19:39.3165399Z creating: build/custom_test_artifacts/custom-op-build/ 2022-11-23T01:19:39.3166107Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2022-11-23T01:19:39.3173210Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:19:39.3173765Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/ 2022-11-23T01:19:39.3174342Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:19:39.3174923Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:19:39.3175475Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:19:39.3177586Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:19:39.3178834Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:19:39.3179410Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:19:39.3179979Z 
creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:19:39.3182703Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:19:39.3184135Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:19:39.3185598Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:19:39.3186393Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:19:39.3188442Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:19:39.3189691Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:19:39.3190356Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:19:39.3190930Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:19:39.3245351Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:19:39.3246101Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:19:39.3246844Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:19:39.3247760Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:19:39.3248522Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:19:39.3249252Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-11-23T01:19:39.3249950Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:19:39.3250662Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:19:39.3251545Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:19:39.3293893Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:19:39.3335272Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:19:39.3336439Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:19:39.3337330Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:19:39.3338005Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:19:39.3338811Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:19:39.3339810Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:19:39.3340799Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:19:39.3342910Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:19:39.3416045Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:19:39.3489364Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:19:39.3490024Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:19:39.3490575Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:19:39.3491480Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeError.log 2022-11-23T01:19:39.3492077Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2022-11-23T01:19:39.3492630Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2022-11-23T01:19:39.3493380Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2022-11-23T01:19:39.3494023Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2022-11-23T01:19:39.3494632Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2022-11-23T01:19:39.3495342Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2022-11-23T01:19:39.3496075Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2022-11-23T01:19:39.3497169Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2022-11-23T01:19:39.3497926Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2022-11-23T01:19:39.3498668Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2022-11-23T01:19:39.3499270Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2022-11-23T01:19:39.3520893Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2022-11-23T01:19:39.3635179Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2022-11-23T01:19:39.3635806Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2022-11-23T01:19:39.3636427Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2022-11-23T01:19:39.3637080Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2022-11-23T01:19:39.3637692Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2022-11-23T01:19:39.3638298Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2022-11-23T01:19:39.3639046Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2022-11-23T01:19:39.3640168Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2022-11-23T01:19:39.3640984Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2022-11-23T01:19:39.3641751Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2022-11-23T01:19:39.3642353Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2022-11-23T01:19:39.3663453Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2022-11-23T01:19:39.3746553Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2022-11-23T01:19:39.3747211Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:19:39.3747815Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:19:39.3748392Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 
2022-11-23T01:19:39.3749357Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2022-11-23T01:19:39.3751096Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2022-11-23T01:19:39.3751633Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2022-11-23T01:19:39.3754990Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2022-11-23T01:19:39.3755656Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2022-11-23T01:19:39.3756490Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2022-11-23T01:19:39.3849475Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2022-11-23T01:19:39.3912651Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2022-11-23T01:19:39.3913133Z creating: build/custom_test_artifacts/jit-hook-build/ 2022-11-23T01:19:39.3913600Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2022-11-23T01:19:39.3920395Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:19:39.3920954Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/ 2022-11-23T01:19:39.3921507Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:19:39.3922077Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:19:39.3922613Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:19:39.3924750Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:19:39.3925972Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:19:39.3926547Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:19:39.3927100Z creating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:19:39.3929861Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:19:39.3930904Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:19:39.3932822Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:19:39.3933437Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:19:39.3935131Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:19:39.3936352Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:19:39.3936940Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:19:39.3937496Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:19:39.3992486Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:19:39.3993220Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:19:39.3993953Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:19:39.3994702Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:19:39.3995418Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:19:39.3996126Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 
2022-11-23T01:19:39.3996823Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:19:39.3997524Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:19:39.3998498Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:19:39.4040685Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:19:39.4082005Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:19:39.4083061Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:19:39.4084050Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:19:39.4084784Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:19:39.4085651Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:19:39.4086470Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:19:39.4087327Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:19:39.4089311Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:19:39.4162520Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:19:39.4235927Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:19:39.4236586Z 
inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:19:39.4237146Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:19:39.4237898Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeError.log 2022-11-23T01:19:39.4238467Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2022-11-23T01:19:39.4239024Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2022-11-23T01:19:39.4239611Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2022-11-23T01:19:39.4240248Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2022-11-23T01:19:39.4240860Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2022-11-23T01:19:39.4241502Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2022-11-23T01:19:39.4242519Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2022-11-23T01:19:39.4243710Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2022-11-23T01:19:39.4244477Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2022-11-23T01:19:39.4245221Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2022-11-23T01:19:39.4245947Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2022-11-23T01:19:39.4266930Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2022-11-23T01:19:39.4330504Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2022-11-23T01:19:39.4331150Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:19:39.4331772Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:19:39.4332348Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2022-11-23T01:19:39.4333357Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2022-11-23T01:19:39.4334608Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2022-11-23T01:19:39.4335316Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2022-11-23T01:19:39.4338183Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2022-11-23T01:19:39.4338796Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2022-11-23T01:19:39.4339780Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2022-11-23T01:19:39.4389126Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2022-11-23T01:19:39.4389913Z creating: build/custom_test_artifacts/custom-backend-build/ 2022-11-23T01:19:39.4390435Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2022-11-23T01:19:39.4397088Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeOutput.log 2022-11-23T01:19:39.4397663Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/ 2022-11-23T01:19:39.4398256Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeSystem.cmake 2022-11-23T01:19:39.4398850Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/ 2022-11-23T01:19:39.4399438Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/tmp/ 2022-11-23T01:19:39.4401469Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/CMakeCCompilerId.c 2022-11-23T01:19:39.4402716Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdC/a.out 2022-11-23T01:19:39.4403328Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/ 2022-11-23T01:19:39.4403936Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/tmp/ 2022-11-23T01:19:39.4406519Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/CMakeCXXCompilerId.cpp 2022-11-23T01:19:39.4407674Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCXX/a.out 2022-11-23T01:19:39.4409624Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_C.bin 2022-11-23T01:19:39.4410287Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCCompiler.cmake 2022-11-23T01:19:39.4411670Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CXX.bin 2022-11-23T01:19:39.4412988Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCXXCompiler.cmake 2022-11-23T01:19:39.4413609Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/ 2022-11-23T01:19:39.4414212Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/ 2022-11-23T01:19:39.4468569Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2022-11-23T01:19:39.4469600Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2022-11-23T01:19:39.4470623Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2022-11-23T01:19:39.4471584Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2022-11-23T01:19:39.4472352Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2022-11-23T01:19:39.4473095Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2022-11-23T01:19:39.4473947Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2022-11-23T01:19:39.4474700Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2022-11-23T01:19:39.4475657Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2022-11-23T01:19:39.4517667Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2022-11-23T01:19:39.4559087Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2022-11-23T01:19:39.4560200Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2022-11-23T01:19:39.4561139Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2022-11-23T01:19:39.4561816Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.reg.c 2022-11-23T01:19:39.4562596Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin 2022-11-23T01:19:39.4563685Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2022-11-23T01:19:39.4564692Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/tmp/a_dlink.o 2022-11-23T01:19:39.4566774Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu 2022-11-23T01:19:39.4640030Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CompilerIdCUDA/a.out 2022-11-23T01:19:39.4726335Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeDetermineCompilerABI_CUDA.bin 2022-11-23T01:19:39.4727436Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.22.1/CMakeCUDACompiler.cmake 2022-11-23T01:19:39.4728021Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2022-11-23T01:19:39.4729875Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeError.log 2022-11-23T01:19:39.4730631Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2022-11-23T01:19:39.4731217Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2022-11-23T01:19:39.4732393Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2022-11-23T01:19:39.4733520Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2022-11-23T01:19:39.4734609Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2022-11-23T01:19:39.4735827Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2022-11-23T01:19:39.4736989Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2022-11-23T01:19:39.4739017Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2022-11-23T01:19:39.4740103Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2022-11-23T01:19:39.4741249Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2022-11-23T01:19:39.4742291Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2022-11-23T01:19:39.4748572Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2022-11-23T01:19:39.4946341Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2022-11-23T01:19:39.4946998Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2022-11-23T01:19:39.4947799Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2022-11-23T01:19:39.4948869Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2022-11-23T01:19:39.4950267Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2022-11-23T01:19:39.4951504Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2022-11-23T01:19:39.4952697Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2022-11-23T01:19:39.4954863Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2022-11-23T01:19:39.4955710Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2022-11-23T01:19:39.4956767Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2022-11-23T01:19:39.4957751Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2022-11-23T01:19:39.4985344Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2022-11-23T01:19:39.5064633Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2022-11-23T01:19:39.5065344Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2022-11-23T01:19:39.5066386Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2022-11-23T01:19:39.5067004Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2022-11-23T01:19:39.5069625Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2022-11-23T01:19:39.5072309Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2022-11-23T01:19:39.5072894Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2022-11-23T01:19:39.5076961Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2022-11-23T01:19:39.5077739Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2022-11-23T01:19:39.5079512Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2022-11-23T01:19:39.5217136Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2022-11-23T01:19:39.5263013Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2022-11-23T01:19:39.5263394Z creating: build/lib/ 2022-11-23T01:19:39.5329656Z inflating: build/lib/libgtest.a 2022-11-23T01:19:39.5330671Z inflating: build/lib/libclog.a 2022-11-23T01:19:39.5341119Z inflating: build/lib/libpthreadpool.a 2022-11-23T01:19:39.5447576Z inflating: build/lib/libprotobuf-lite.a 2022-11-23T01:19:39.5456781Z inflating: 
build/lib/libittnotify.a 2022-11-23T01:19:39.5548387Z inflating: build/lib/libbenchmark.a 2022-11-23T01:19:39.5580773Z inflating: build/lib/libtensorpipe_uv.a 2022-11-23T01:19:39.5657660Z inflating: build/lib/libasmjit.a 2022-11-23T01:19:39.6193199Z inflating: build/lib/libprotobuf.a 2022-11-23T01:19:39.6325914Z inflating: build/lib/libgloo.a 2022-11-23T01:19:39.6358302Z inflating: build/lib/libfmt.a 2022-11-23T01:19:39.6360176Z inflating: build/lib/libcaffe2_nvrtc.so 2022-11-23T01:19:39.6360716Z inflating: build/lib/libfoxi_loader.a 2022-11-23T01:19:39.6440507Z inflating: build/lib/libc10.so 2022-11-23T01:19:39.6441738Z inflating: build/lib/libtorch_global_deps.so 2022-11-23T01:19:39.6451814Z inflating: build/lib/libcpuinfo.a 2022-11-23T01:19:39.6460690Z inflating: build/lib/libcpuinfo_internals.a 2022-11-23T01:19:39.7031450Z inflating: build/lib/libprotoc.a 2022-11-23T01:19:39.7047217Z inflating: build/lib/libqnnpack.a 2022-11-23T01:19:39.7049817Z inflating: build/lib/libnnpack_reference_layers.a 2022-11-23T01:19:39.7072980Z inflating: build/lib/libpytorch_qnnpack.a 2022-11-23T01:19:39.7091855Z inflating: build/lib/libgmock.a 2022-11-23T01:19:39.7114476Z inflating: build/lib/libnnpack.a 2022-11-23T01:19:39.7115057Z inflating: build/lib/libgtest_main.a 2022-11-23T01:19:39.7116117Z inflating: build/lib/libbenchmark_main.a 2022-11-23T01:19:40.6988618Z inflating: build/lib/libdnnl.a 2022-11-23T01:19:40.7130183Z inflating: build/lib/libXNNPACK.a 2022-11-23T01:19:40.7785814Z inflating: build/lib/libtensorpipe.a 2022-11-23T01:19:40.7839083Z inflating: build/lib/libc10_cuda.so 2022-11-23T01:19:40.9382265Z inflating: build/lib/libfbgemm.a 2022-11-23T01:19:40.9383028Z inflating: build/lib/libgmock_main.a 2022-11-23T01:19:41.0530915Z inflating: build/lib/libdnnl_graph.a 2022-11-23T01:19:41.1046838Z inflating: build/lib/libkineto.a 2022-11-23T01:19:41.1336432Z inflating: build/lib/libtensorpipe_cuda.a 2022-11-23T01:19:41.1381846Z inflating: build/lib/libcaffe2_protos.a 
2022-11-23T01:19:41.1429786Z inflating: build/lib/libonnx_proto.a 2022-11-23T01:19:41.2108081Z inflating: build/lib/libonnx.a 2022-11-23T01:19:41.2541663Z inflating: build/lib/libgloo_cuda.a 2022-11-23T01:19:43.6154318Z inflating: build/lib/libtorch_cpu.so 2022-11-23T01:19:45.7533085Z inflating: build/lib/libtorch_cuda.so 2022-11-23T01:19:45.7533812Z inflating: build/lib/libtorch.so 2022-11-23T01:19:46.7466890Z inflating: build/lib/libtorch_cuda_linalg.so 2022-11-23T01:19:46.7469643Z inflating: build/lib/libc10d_cuda_test.so 2022-11-23T01:19:46.7493602Z inflating: build/lib/libjitbackend_test.so 2022-11-23T01:19:46.7554320Z inflating: build/lib/libtorchbind_test.so 2022-11-23T01:19:46.7585104Z inflating: build/lib/libbackend_with_compiler.so 2022-11-23T01:19:46.7590891Z inflating: build/lib/libshm.so 2022-11-23T01:19:46.9397645Z inflating: build/lib/libtorch_python.so 2022-11-23T01:19:46.9437484Z inflating: build/lib/libnnapi_backend.so 2022-11-23T01:19:46.9437793Z creating: build/bin/ 2022-11-23T01:19:46.9490207Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2022-11-23T01:19:46.9545319Z inflating: build/bin/c10_DeviceGuard_test 2022-11-23T01:19:46.9599174Z inflating: build/bin/c10_Device_test 2022-11-23T01:19:46.9661935Z inflating: build/bin/c10_DispatchKeySet_test 2022-11-23T01:19:46.9713191Z inflating: build/bin/c10_StreamGuard_test 2022-11-23T01:19:46.9765771Z inflating: build/bin/c10_SymInt_test 2022-11-23T01:19:46.9825064Z inflating: build/bin/c10_InlineDeviceGuard_test 2022-11-23T01:19:46.9884991Z inflating: build/bin/c10_InlineStreamGuard_test 2022-11-23T01:19:46.9945699Z inflating: build/bin/c10_SizesAndStrides_test 2022-11-23T01:19:46.9997059Z inflating: build/bin/c10_Array_test 2022-11-23T01:19:47.0053815Z inflating: build/bin/c10_Bitset_test 2022-11-23T01:19:47.0109055Z inflating: build/bin/c10_C++17_test 2022-11-23T01:19:47.0160422Z inflating: build/bin/c10_ConstexprCrc_test 2022-11-23T01:19:47.0212696Z inflating: 
build/bin/c10_DeadlockDetection_test 2022-11-23T01:19:47.0265679Z inflating: build/bin/c10_Half_test 2022-11-23T01:19:47.0332851Z inflating: build/bin/c10_Metaprogramming_test 2022-11-23T01:19:47.0393458Z inflating: build/bin/c10_LeftRight_test 2022-11-23T01:19:47.0549244Z inflating: build/bin/c10_SmallVectorTest 2022-11-23T01:19:47.0603190Z inflating: build/bin/c10_Synchronized_test 2022-11-23T01:19:47.0664326Z inflating: build/bin/c10_ThreadLocal_test 2022-11-23T01:19:47.0720620Z inflating: build/bin/c10_TypeIndex_test 2022-11-23T01:19:47.0774331Z inflating: build/bin/c10_TypeList_test 2022-11-23T01:19:47.0825625Z inflating: build/bin/c10_TypeTraits_test 2022-11-23T01:19:47.0880954Z inflating: build/bin/c10_accumulate_test 2022-11-23T01:19:47.0940439Z inflating: build/bin/c10_bfloat16_test 2022-11-23T01:19:47.0998251Z inflating: build/bin/c10_complex_math_test 2022-11-23T01:19:47.1057448Z inflating: build/bin/c10_complex_test 2022-11-23T01:19:47.1175186Z inflating: build/bin/c10_either_test 2022-11-23T01:19:47.1231113Z inflating: build/bin/c10_exception_test 2022-11-23T01:19:47.1284557Z inflating: build/bin/c10_flags_test 2022-11-23T01:19:47.1467298Z inflating: build/bin/c10_intrusive_ptr_test 2022-11-23T01:19:47.1521232Z inflating: build/bin/c10_irange_test 2022-11-23T01:19:47.1582813Z inflating: build/bin/c10_logging_test 2022-11-23T01:19:47.1662758Z inflating: build/bin/c10_optional_test 2022-11-23T01:19:47.1728992Z inflating: build/bin/c10_ordered_preserving_dict_test 2022-11-23T01:19:47.1787387Z inflating: build/bin/c10_registry_test 2022-11-23T01:19:47.1850808Z inflating: build/bin/c10_string_view_test 2022-11-23T01:19:47.1905750Z inflating: build/bin/c10_tempfile_test 2022-11-23T01:19:47.1966251Z inflating: build/bin/c10_typeid_test 2022-11-23T01:19:47.2026357Z inflating: build/bin/c10_intrusive_ptr_benchmark 2022-11-23T01:19:47.2548630Z inflating: build/bin/protoc-3.13.0.0 2022-11-23T01:19:47.3070337Z inflating: build/bin/protoc 
2022-11-23T01:19:47.3122056Z inflating: build/bin/c10_cuda_CUDATest 2022-11-23T01:19:47.3438256Z inflating: build/bin/vec_test_all_types_DEFAULT 2022-11-23T01:19:47.3794071Z inflating: build/bin/vec_test_all_types_AVX2 2022-11-23T01:19:47.3851405Z inflating: build/bin/FileStoreTest 2022-11-23T01:19:47.3908527Z inflating: build/bin/HashStoreTest 2022-11-23T01:19:47.3973187Z inflating: build/bin/TCPStoreTest 2022-11-23T01:19:47.3989020Z inflating: build/bin/ProcessGroupMPITest 2022-11-23T01:19:47.3992521Z inflating: build/bin/example_allreduce 2022-11-23T01:19:47.4048436Z inflating: build/bin/Dimname_test 2022-11-23T01:19:47.4117442Z inflating: build/bin/MaybeOwned_test 2022-11-23T01:19:47.4196018Z inflating: build/bin/Dict_test 2022-11-23T01:19:47.4257174Z inflating: build/bin/NamedTensor_test 2022-11-23T01:19:47.4320838Z inflating: build/bin/apply_utils_test 2022-11-23T01:19:47.4383770Z inflating: build/bin/atest 2022-11-23T01:19:47.4441441Z inflating: build/bin/broadcast_test 2022-11-23T01:19:47.4503801Z inflating: build/bin/cpu_generator_test 2022-11-23T01:19:47.4569972Z inflating: build/bin/basic 2022-11-23T01:19:47.4625689Z inflating: build/bin/cpu_profiling_allocator_test 2022-11-23T01:19:47.4679166Z inflating: build/bin/dispatch_key_set_test 2022-11-23T01:19:47.4732773Z inflating: build/bin/dlconvertor_test 2022-11-23T01:19:47.4827984Z inflating: build/bin/cpu_rng_test 2022-11-23T01:19:47.4890352Z inflating: build/bin/extension_backend_test 2022-11-23T01:19:47.4950755Z inflating: build/bin/half_test 2022-11-23T01:19:47.5052460Z inflating: build/bin/ivalue_test 2022-11-23T01:19:47.5104917Z inflating: build/bin/lazy_tensor_test 2022-11-23T01:19:47.5162614Z inflating: build/bin/math_kernel_test 2022-11-23T01:19:47.5219746Z inflating: build/bin/memory_overlapping_test 2022-11-23T01:19:47.5277314Z inflating: build/bin/memory_format_test 2022-11-23T01:19:47.5332882Z inflating: build/bin/mobile_memory_cleanup 2022-11-23T01:19:47.5387187Z inflating: 
build/bin/operator_name_test 2022-11-23T01:19:47.5447392Z inflating: build/bin/native_test 2022-11-23T01:19:47.5500985Z inflating: build/bin/operators_test 2022-11-23T01:19:47.5557302Z inflating: build/bin/packedtensoraccessor_test 2022-11-23T01:19:47.5627086Z inflating: build/bin/pow_test 2022-11-23T01:19:47.5688425Z inflating: build/bin/quantized_test 2022-11-23T01:19:47.5741147Z inflating: build/bin/reduce_ops_test 2022-11-23T01:19:47.5795466Z inflating: build/bin/reportMemoryUsage_test 2022-11-23T01:19:47.5855915Z inflating: build/bin/scalar_tensor_test 2022-11-23T01:19:47.5917417Z inflating: build/bin/scalar_test 2022-11-23T01:19:47.5973038Z inflating: build/bin/stride_properties_test 2022-11-23T01:19:47.6058099Z inflating: build/bin/tensor_iterator_test 2022-11-23T01:19:47.6118006Z inflating: build/bin/test_parallel 2022-11-23T01:19:47.6177665Z inflating: build/bin/type_ptr_test 2022-11-23T01:19:47.6180539Z inflating: build/bin/thread_init_test 2022-11-23T01:19:47.6233553Z inflating: build/bin/variant_test 2022-11-23T01:19:47.6298302Z inflating: build/bin/type_test 2022-11-23T01:19:47.6354279Z inflating: build/bin/undefined_tensor_test 2022-11-23T01:19:47.6408600Z inflating: build/bin/weakref_test 2022-11-23T01:19:47.6409850Z inflating: build/bin/verify_api_visibility 2022-11-23T01:19:47.6484696Z inflating: build/bin/vmap_test 2022-11-23T01:19:47.6538889Z inflating: build/bin/wrapdim_test 2022-11-23T01:19:47.6603185Z inflating: build/bin/IListRef_test 2022-11-23T01:19:47.6655339Z inflating: build/bin/xla_tensor_test 2022-11-23T01:19:47.6773777Z inflating: build/bin/List_test 2022-11-23T01:19:47.6904756Z inflating: build/bin/kernel_function_legacy_test 2022-11-23T01:19:47.7008560Z inflating: build/bin/kernel_function_test 2022-11-23T01:19:47.7078622Z inflating: build/bin/KernelFunction_test 2022-11-23T01:19:47.7217213Z inflating: build/bin/kernel_lambda_legacy_test 2022-11-23T01:19:47.7328912Z inflating: build/bin/kernel_lambda_test 
2022-11-23T01:19:47.7393182Z inflating: build/bin/kernel_stackbased_test 2022-11-23T01:19:47.7446875Z inflating: build/bin/CppSignature_test 2022-11-23T01:19:47.7550155Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2022-11-23T01:19:47.7601327Z inflating: build/bin/op_allowlist_test 2022-11-23T01:19:47.7661535Z inflating: build/bin/backend_fallback_test 2022-11-23T01:19:47.7975751Z inflating: build/bin/op_registration_test 2022-11-23T01:19:47.8032264Z inflating: build/bin/inline_container_test 2022-11-23T01:19:47.8088271Z inflating: build/bin/cuda_apply_test 2022-11-23T01:19:47.8152598Z inflating: build/bin/cuda_atomic_ops_test 2022-11-23T01:19:47.8209707Z inflating: build/bin/cuda_caching_host_allocator_test 2022-11-23T01:19:47.8282063Z inflating: build/bin/cuda_complex_math_test 2022-11-23T01:19:47.8334210Z inflating: build/bin/cuda_device_test 2022-11-23T01:19:47.8397007Z inflating: build/bin/cuda_complex_test 2022-11-23T01:19:47.8460289Z inflating: build/bin/cuda_cub_test 2022-11-23T01:19:47.8513857Z inflating: build/bin/cuda_dlconvertor_test 2022-11-23T01:19:47.8567593Z inflating: build/bin/cuda_integer_divider_test 2022-11-23T01:19:47.8639657Z inflating: build/bin/cuda_distributions_test 2022-11-23T01:19:47.8702215Z inflating: build/bin/cuda_generator_test 2022-11-23T01:19:47.8755343Z inflating: build/bin/cuda_half_test 2022-11-23T01:19:47.8819948Z inflating: build/bin/cuda_stream_test 2022-11-23T01:19:47.8872239Z inflating: build/bin/cuda_optional_test 2022-11-23T01:19:47.8928172Z inflating: build/bin/cuda_reportMemoryUsage_test 2022-11-23T01:19:47.8982908Z inflating: build/bin/cuda_packedtensoraccessor_test 2022-11-23T01:19:47.9034997Z inflating: build/bin/cuda_cudnn_test 2022-11-23T01:19:47.9091307Z inflating: build/bin/cuda_vectorized_test 2022-11-23T01:19:47.9108873Z inflating: build/bin/tutorial_tensorexpr 2022-11-23T01:19:47.9178794Z inflating: build/bin/ProcessGroupGlooTest 2022-11-23T01:19:47.9241156Z inflating: 
build/bin/ProcessGroupGlooAsyncTest 2022-11-23T01:19:47.9306930Z inflating: build/bin/ProcessGroupNCCLTest 2022-11-23T01:19:47.9369803Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2022-11-23T01:19:47.9426182Z inflating: build/bin/ProcessGroupUCCTest 2022-11-23T01:19:47.9484253Z inflating: build/bin/test_dist_autograd 2022-11-23T01:19:47.9486829Z inflating: build/bin/parallel_benchmark 2022-11-23T01:19:47.9561056Z inflating: build/bin/test_mobile_nnc 2022-11-23T01:19:47.9636168Z inflating: build/bin/test_cpp_rpc 2022-11-23T01:19:47.9647505Z inflating: build/bin/aot_model_compiler_test 2022-11-23T01:19:48.0562548Z inflating: build/bin/test_tensorexpr 2022-11-23T01:19:48.0568091Z inflating: build/bin/torch_shm_manager 2022-11-23T01:19:48.0954971Z inflating: build/bin/test_lazy 2022-11-23T01:19:48.2281937Z inflating: build/bin/test_api 2022-11-23T01:19:48.3491552Z inflating: build/bin/test_jit 2022-11-23T01:19:48.3493668Z inflating: .pytorch-test-times.json 2022-11-23T01:19:48.3523709Z ##[group]Run df -H 2022-11-23T01:19:48.3523966Z df -H 2022-11-23T01:19:48.3537634Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T01:19:48.3537919Z env: 2022-11-23T01:19:48.3538163Z GIT_DEFAULT_BRANCH: master 2022-11-23T01:19:48.3538445Z GPU_FLAG: --gpus all 2022-11-23T01:19:48.3538677Z ##[endgroup] 2022-11-23T01:19:48.3578575Z Filesystem Size Used Avail Use% Mounted on 2022-11-23T01:19:48.3579315Z devtmpfs 129G 0 129G 0% /dev 2022-11-23T01:19:48.3579767Z tmpfs 129G 0 129G 0% /dev/shm 2022-11-23T01:19:48.3580077Z tmpfs 129G 607k 129G 1% /run 2022-11-23T01:19:48.3581642Z tmpfs 129G 0 129G 0% /sys/fs/cgroup 2022-11-23T01:19:48.3581987Z /dev/xvda1 162G 30G 132G 19% / 2022-11-23T01:19:48.3582270Z tmpfs 26G 0 26G 0% /run/user/0 2022-11-23T01:19:48.3605805Z ##[group]Run .github/scripts/parse_ref.py 2022-11-23T01:19:48.3606163Z .github/scripts/parse_ref.py 2022-11-23T01:19:48.3618460Z shell: /usr/bin/bash -e {0} 2022-11-23T01:19:48.3618716Z env: 
2022-11-23T01:19:48.3618959Z GIT_DEFAULT_BRANCH: master
2022-11-23T01:19:48.3619212Z GPU_FLAG: --gpus all
2022-11-23T01:19:48.3619463Z ##[endgroup]
2022-11-23T01:19:48.3911046Z ##[group]Run set -x
2022-11-23T01:19:48.3911435Z set -x
2022-11-23T01:19:48.3911667Z
2022-11-23T01:19:48.3911927Z if [[ $TEST_CONFIG == 'multigpu' ]]; then
2022-11-23T01:19:48.3912282Z  TEST_COMMAND=.jenkins/pytorch/multigpu-test.sh
2022-11-23T01:19:48.3912641Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then
2022-11-23T01:19:48.3912954Z  TEST_COMMAND=.jenkins/caffe2/test.sh
2022-11-23T01:19:48.3913231Z else
2022-11-23T01:19:48.3913510Z  TEST_COMMAND=.jenkins/pytorch/test.sh
2022-11-23T01:19:48.3913785Z fi
2022-11-23T01:19:48.3913985Z
2022-11-23T01:19:48.3914435Z COMMIT_MESSAGES=$(git cherry -v "origin/${GIT_DEFAULT_BRANCH:-master}")
2022-11-23T01:19:48.3914762Z
2022-11-23T01:19:48.3915044Z # sanitize the input commit message and PR body here:
2022-11-23T01:19:48.3915339Z #
2022-11-23T01:19:48.3915725Z # trim all new lines from commit messages + PR_BODY to avoid issues with batch environment
2022-11-23T01:19:48.3916226Z # variable copying. see https://github.com/pytorch/pytorch/pull/80043#issuecomment-1167796028
2022-11-23T01:19:48.3916653Z COMMIT_MESSAGES="${COMMIT_MESSAGES//[$'\n\r']}"
2022-11-23T01:19:48.3916968Z PR_BODY="${PR_BODY//[$'\n\r']}"
2022-11-23T01:19:48.3917208Z
2022-11-23T01:19:48.3917565Z # then trim all special characters like single and double quotes to avoid unescaped inputs to
2022-11-23T01:19:48.3917948Z # wreak havoc internally
2022-11-23T01:19:48.3918276Z export COMMIT_MESSAGES="${COMMIT_MESSAGES//[\'\"]}"
2022-11-23T01:19:48.3918595Z export PR_BODY="${PR_BODY//[\'\"]}"
2022-11-23T01:19:48.3918857Z
2022-11-23T01:19:48.3919174Z # detached container should get cleaned up by teardown_ec2_linux
2022-11-23T01:19:48.3919566Z # TODO: Stop building test binaries as part of the build phase
2022-11-23T01:19:48.3919946Z # Used for GPU_FLAG since that doesn't play nice
2022-11-23T01:19:48.3920280Z # shellcheck disable=SC2086,SC2090
2022-11-23T01:19:48.3920585Z container_name=$(docker run \
2022-11-23T01:19:48.3920846Z  ${GPU_FLAG:-} \
2022-11-23T01:19:48.3921119Z  -e BUILD_ENVIRONMENT \
2022-11-23T01:19:48.3921394Z  -e PR_NUMBER \
2022-11-23T01:19:48.3921644Z  -e GITHUB_ACTIONS \
2022-11-23T01:19:48.3921905Z  -e BASE_SHA \
2022-11-23T01:19:48.3922153Z  -e BRANCH \
2022-11-23T01:19:48.3922380Z  -e SHA1 \
2022-11-23T01:19:48.3922642Z  -e AWS_DEFAULT_REGION \
2022-11-23T01:19:48.3922919Z  -e IN_WHEEL_TEST \
2022-11-23T01:19:48.3923165Z  -e SHARD_NUMBER \
2022-11-23T01:19:48.3923427Z  -e TEST_CONFIG \
2022-11-23T01:19:48.3923694Z  -e NUM_TEST_SHARDS \
2022-11-23T01:19:48.3923939Z  -e PR_BODY \
2022-11-23T01:19:48.3924202Z  -e COMMIT_MESSAGES \
2022-11-23T01:19:48.3924495Z  -e PYTORCH_RETRY_TEST_CASES \
2022-11-23T01:19:48.3924795Z  -e PYTORCH_OVERRIDE_FLAKY_SIGNAL \
2022-11-23T01:19:48.3925086Z  -e PR_LABELS \
2022-11-23T01:19:48.3925376Z  -e MAX_JOBS="$(nproc --ignore=2)" \
2022-11-23T01:19:48.3925667Z  -e SCCACHE_BUCKET \
2022-11-23T01:19:48.3925931Z  -e SCCACHE_S3_KEY_PREFIX \
2022-11-23T01:19:48.3926201Z  -e XLA_CUDA \
2022-11-23T01:19:48.3926487Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \
2022-11-23T01:19:48.3926792Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \
2022-11-23T01:19:48.3927129Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \
2022-11-23T01:19:48.3927482Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \
2022-11-23T01:19:48.3927788Z  --ulimit stack=10485760:83886080 \
2022-11-23T01:19:48.3928104Z  --security-opt seccomp=unconfined \
2022-11-23T01:19:48.3928488Z  --cap-add=SYS_PTRACE \
2022-11-23T01:19:48.3928771Z  --ipc=host \
2022-11-23T01:19:48.3929025Z  --shm-size="${SHM_SIZE}" \
2022-11-23T01:19:48.3929285Z  --tty \
2022-11-23T01:19:48.3929522Z  --detach \
2022-11-23T01:19:48.3929774Z  --name="${container_name}" \
2022-11-23T01:19:48.3930049Z  --user jenkins \
2022-11-23T01:19:48.3930370Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \
2022-11-23T01:19:48.3930701Z  -w /var/lib/jenkins/workspace \
2022-11-23T01:19:48.3930984Z  "${DOCKER_IMAGE}"
2022-11-23T01:19:48.3931226Z )
2022-11-23T01:19:48.3931582Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}"
2022-11-23T01:19:48.3932036Z docker exec -t "${container_name}" sh -c "pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}"
2022-11-23T01:19:48.3943974Z shell: /usr/bin/bash -e {0}
2022-11-23T01:19:48.3944225Z env:
2022-11-23T01:19:48.3944469Z GIT_DEFAULT_BRANCH: master
2022-11-23T01:19:48.3944720Z GPU_FLAG: --gpus all
2022-11-23T01:19:48.3945048Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.10-gcc7
2022-11-23T01:19:48.3945362Z PR_NUMBER:
2022-11-23T01:19:48.3945585Z BRANCH: master
2022-11-23T01:19:48.3945863Z SHA1: 1cfd3858ac54fe3883534309081631a0a892ba3f
2022-11-23T01:19:48.3946180Z BASE_SHA: 1cfd3858ac54fe3883534309081631a0a892ba3f
2022-11-23T01:19:48.3946464Z PYTORCH_RETRY_TEST_CASES: 1
2022-11-23T01:19:48.3946749Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1
2022-11-23T01:19:48.3947029Z TEST_CONFIG: distributed
2022-11-23T01:19:48.3947283Z SHARD_NUMBER: 3
2022-11-23T01:19:48.3947507Z NUM_TEST_SHARDS: 3 2022-11-23T01:19:48.3947748Z PR_BODY: 2022-11-23T01:19:48.3948051Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2022-11-23T01:19:48.3948364Z SCCACHE_S3_KEY_PREFIX: pull 2022-11-23T01:19:48.3948618Z SHM_SIZE: 2g 2022-11-23T01:19:48.3949489Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:19:48.3949966Z XLA_CUDA: 2022-11-23T01:19:48.3950316Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2022-11-23T01:19:48.3950702Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2022-11-23T01:19:48.3950988Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2022-11-23T01:19:48.3951263Z ##[endgroup] 2022-11-23T01:19:48.3979376Z + [[ distributed == \m\u\l\t\i\g\p\u ]] 2022-11-23T01:19:48.3979887Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *onnx* ]] 2022-11-23T01:19:48.3980239Z + TEST_COMMAND=.jenkins/pytorch/test.sh 2022-11-23T01:19:48.3982982Z ++ git cherry -v origin/master 2022-11-23T01:19:48.3999427Z + COMMIT_MESSAGES= 2022-11-23T01:19:48.3999691Z + COMMIT_MESSAGES= 2022-11-23T01:19:48.3999914Z + PR_BODY= 2022-11-23T01:19:48.4000170Z + export COMMIT_MESSAGES= 2022-11-23T01:19:48.4000430Z + COMMIT_MESSAGES= 2022-11-23T01:19:48.4000678Z + export PR_BODY= 2022-11-23T01:19:48.4000900Z + PR_BODY= 2022-11-23T01:19:48.4009816Z +++ nproc --ignore=2 2022-11-23T01:19:48.4048402Z ++ docker run --gpus all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e PR_BODY -e COMMIT_MESSAGES -e PYTORCH_RETRY_TEST_CASES -e PYTORCH_OVERRIDE_FLAKY_SIGNAL -e PR_LABELS -e MAX_JOBS=30 -e SCCACHE_BUCKET -e SCCACHE_S3_KEY_PREFIX -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS --env-file=/tmp/github_env_3528293562 --ulimit stack=10485760:83886080 
--security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T01:20:02.5974417Z + container_name=08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T01:20:02.5974973Z + echo DOCKER_CONTAINER_ID=08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T01:20:02.5979522Z ++ echo dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:20:02.5981615Z + docker exec -t 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 sh -c 'pip install dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl[opt-einsum] && .jenkins/pytorch/test.sh' 2022-11-23T01:20:03.1499645Z Processing ./dist/torch-1.14.0a0+git1cfd385-cp310-cp310-linux_x86_64.whl 2022-11-23T01:20:04.1042471Z Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (2.6.3) 2022-11-23T01:20:04.1046214Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (1.11.1) 2022-11-23T01:20:04.1050884Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (4.4.0) 2022-11-23T01:20:04.1067866Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/lib/python3.10/site-packages (from torch==1.14.0a0+git1cfd385) (3.3.0) 2022-11-23T01:20:04.1148172Z Requirement already satisfied: numpy>=1.7 in /opt/conda/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==1.14.0a0+git1cfd385) (1.21.2) 2022-11-23T01:20:04.1369442Z Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch==1.14.0a0+git1cfd385) 
(1.2.1) 2022-11-23T01:20:05.0590502Z Installing collected packages: torch 2022-11-23T01:20:14.7811756Z Successfully installed torch-1.14.0a0+git1cfd385 2022-11-23T01:20:14.8471438Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2022-11-23T01:20:14.8690427Z + TORCH_INSTALL_DIR=/opt/conda/lib/python3.10/site-packages/torch 2022-11-23T01:20:14.8690943Z + TORCH_BIN_DIR=/opt/conda/lib/python3.10/site-packages/torch/bin 2022-11-23T01:20:14.8691422Z + TORCH_LIB_DIR=/opt/conda/lib/python3.10/site-packages/torch/lib 2022-11-23T01:20:14.8694724Z + TORCH_TEST_DIR=/opt/conda/lib/python3.10/site-packages/torch/test 2022-11-23T01:20:14.8695282Z + BUILD_DIR=build 2022-11-23T01:20:14.8695757Z + BUILD_RENAMED_DIR=build_renamed 2022-11-23T01:20:14.8696341Z + BUILD_BIN_DIR=build/bin 2022-11-23T01:20:14.8696883Z + export VALGRIND=ON 2022-11-23T01:20:14.8697381Z + VALGRIND=ON 2022-11-23T01:20:14.8700520Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *clang9* ]] 2022-11-23T01:20:14.8701016Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *bazel* ]] 2022-11-23T01:20:14.8701368Z ++ realpath build/custom_test_artifacts 2022-11-23T01:20:14.8704885Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2022-11-23T01:20:14.8708655Z ++ dirname .jenkins/pytorch/test.sh 2022-11-23T01:20:14.8717146Z + source .jenkins/pytorch/common.sh 2022-11-23T01:20:14.8720764Z +++ dirname .jenkins/pytorch/common.sh 2022-11-23T01:20:14.8731251Z ++ source .jenkins/pytorch/common_utils.sh 2022-11-23T01:20:14.8733633Z +++ declare -f -t trap_add 2022-11-23T01:20:14.8739914Z ++ set -ex 2022-11-23T01:20:14.8740315Z ++ [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]] 2022-11-23T01:20:14.8740643Z ++ BUILD_TEST_LIBTORCH=0 2022-11-23T01:20:14.8741735Z + echo 'Environment variables' 2022-11-23T01:20:14.8742018Z Environment variables 2022-11-23T01:20:14.8742263Z + env 2022-11-23T01:20:14.8750117Z SHARD_NUMBER=3 2022-11-23T01:20:14.8750497Z NV_LIBCUBLAS_DEV_VERSION=11.9.2.110-1 
2022-11-23T01:20:14.8750892Z NV_CUDA_COMPAT_PACKAGE=cuda-compat-11-6 2022-11-23T01:20:14.8751260Z LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2022-11-23T01:20:14.8751698Z NV_LIBNCCL_DEV_PACKAGE=libnccl-dev=2.12.10-1+cuda11.6 2022-11-23T01:20:14.8752219Z UCC_HOME=/usr 2022-11-23T01:20:14.8752831Z BUILD_ENVIRONMENT=linux-bionic-cuda11.6-py3.10-gcc7 2022-11-23T01:20:14.8753167Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2022-11-23T01:20:14.8753569Z NV_LIBNPP_DEV_PACKAGE=libnpp-dev-11-6=11.6.3.124-1 2022-11-23T01:20:14.8754130Z INSTALLED_DB=yes 2022-11-23T01:20:14.8754576Z HOSTNAME=08317a7e7676 2022-11-23T01:20:14.8755030Z GITHUB_REF_NAME=master 2022-11-23T01:20:14.8755440Z GITHUB_API_URL=https://api.github.com 2022-11-23T01:20:14.8755745Z OPENSSL_DIR=/opt/openssl 2022-11-23T01:20:14.8756037Z UCC_COMMIT=1c7a7127186e7836f73aafbd7697bbc274a77eee 2022-11-23T01:20:14.8756665Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_ce135d09-f910-49d5-b241-54a523bb5e1c 2022-11-23T01:20:14.8757093Z CUDA_PATH=/usr/local/cuda 2022-11-23T01:20:14.8757851Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2022-11-23T01:20:14.8758774Z GITHUB_RUN_ATTEMPT=1 2022-11-23T01:20:14.8759271Z TEST_CONFIG=distributed 2022-11-23T01:20:14.8759712Z NV_LIBNPP_VERSION=11.6.3.124-1 2022-11-23T01:20:14.8760523Z NV_NVPROF_DEV_PACKAGE=cuda-nvprof-11-6=11.6.124-1 2022-11-23T01:20:14.8760906Z GITHUB_REPOSITORY_OWNER=pytorch 2022-11-23T01:20:14.8761374Z GITHUB_ACTIONS=true 2022-11-23T01:20:14.8761628Z NVIDIA_VISIBLE_DEVICES=all 2022-11-23T01:20:14.8762090Z NV_NVPROF_VERSION=11.6.124-1 2022-11-23T01:20:14.8762495Z NV_LIBCUSPARSE_VERSION=11.7.2.124-1 2022-11-23T01:20:14.8762761Z CI=true 2022-11-23T01:20:14.8762997Z PYTORCH_OVERRIDE_FLAKY_SIGNAL=1 2022-11-23T01:20:14.8763403Z NV_LIBCUBLAS_DEV_PACKAGE=libcublas-dev-11-6=11.9.2.110-1 2022-11-23T01:20:14.8763705Z BRANCH=master 2022-11-23T01:20:14.8763928Z 
GITHUB_HEAD_REF= 2022-11-23T01:20:14.8764236Z UCX_COMMIT=31e74cac7bee0ef66bef2af72e7d86d9c282e5ab 2022-11-23T01:20:14.8764564Z GITHUB_ACTOR=pytorchmergebot 2022-11-23T01:20:14.8764872Z CMAKE_CUDA_COMPILER_LAUNCHER=/opt/cache/bin/sccache 2022-11-23T01:20:14.8765176Z GITHUB_ACTION_REF= 2022-11-23T01:20:14.8765454Z NCCL_VERSION=2.12.10-1 2022-11-23T01:20:14.8765694Z GITHUB_ACTION=__self 2022-11-23T01:20:14.8765936Z VALGRIND=ON 2022-11-23T01:20:14.8766186Z GITHUB_REF_PROTECTED=true 2022-11-23T01:20:14.8766627Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2022-11-23T01:20:14.8767017Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2022-11-23T01:20:14.8767597Z *** 2022-11-23T01:20:14.8767817Z INSTALLED_VISION=yes 2022-11-23T01:20:14.8768067Z NVARCH=x86_64 2022-11-23T01:20:14.8768378Z NV_LIBCUSPARSE_DEV_VERSION=11.7.2.124-1 2022-11-23T01:20:14.8768643Z HOME=/var/lib/jenkins 2022-11-23T01:20:14.8769175Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_ce135d09-f910-49d5-b241-54a523bb5e1c 2022-11-23T01:20:14.8769590Z CARGO_NET_GIT_FETCH_WITH_CLI=true 2022-11-23T01:20:14.8769873Z GITHUB_ACTION_REPOSITORY= 2022-11-23T01:20:14.8770124Z GITHUB_REF_TYPE=branch 2022-11-23T01:20:14.8770436Z NV_LIBNCCL_PACKAGE_VERSION=2.12.10-1 2022-11-23T01:20:14.8770732Z GITHUB_RETENTION_DAYS=90 2022-11-23T01:20:14.8771103Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2022-11-23T01:20:14.8771518Z NV_LIBNCCL_PACKAGE=libnccl2=2.12.10-1+cuda11.6 2022-11-23T01:20:14.8772072Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_ce135d09-f910-49d5-b241-54a523bb5e1c 2022-11-23T01:20:14.8772465Z DEBIAN_FRONTEND=noninteractive 2022-11-23T01:20:14.8772822Z NV_LIBNCCL_DEV_PACKAGE_NAME=libnccl-dev 2022-11-23T01:20:14.8773119Z GITHUB_REF=refs/heads/master 2022-11-23T01:20:14.8773402Z NV_CUDA_LIB_VERSION=11.6.2-1 2022-11-23T01:20:14.8773716Z GITHUB_SHA=1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:20:14.8774016Z 
INSTALLED_PROTOBUF=yes 2022-11-23T01:20:14.8774263Z GITHUB_RUN_ID=3528293562 2022-11-23T01:20:14.8774615Z NV_LIBNPP_PACKAGE=libnpp-11-6=11.6.3.124-1 2022-11-23T01:20:14.8774924Z NV_LIBNCCL_PACKAGE_NAME=libnccl2 2022-11-23T01:20:14.8775210Z LIBRARY_PATH=/usr/local/cuda/lib64/stubs 2022-11-23T01:20:14.8775533Z NV_NVTX_VERSION=11.6.124-1 2022-11-23T01:20:14.8775832Z GITHUB_SERVER_URL=https://github.com 2022-11-23T01:20:14.8776113Z MAX_JOBS=30 2022-11-23T01:20:14.8776383Z NV_LIBCUBLAS_VERSION=11.9.2.110-1 2022-11-23T01:20:14.8776759Z NV_LIBCUBLAS_PACKAGE=libcublas-11-6=11.9.2.110-1 2022-11-23T01:20:14.8777350Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2022-11-23T01:20:14.8777703Z UCX_HOME=/usr 2022-11-23T01:20:14.8777961Z PYTORCH_RETRY_TEST_CASES=1 2022-11-23T01:20:14.8778294Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2022-11-23T01:20:14.8778629Z BASE_SHA=1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:20:14.8778974Z NV_CUDA_CUDART_DEV_VERSION=11.6.55-1 2022-11-23T01:20:14.8779238Z PR_BODY= 2022-11-23T01:20:14.8779451Z GITHUB_BASE_REF= 2022-11-23T01:20:14.8779687Z TERM=xterm 2022-11-23T01:20:14.8779911Z XLA_CUDA= 2022-11-23T01:20:14.8780165Z NV_NVML_DEV_VERSION=11.6.55-1 2022-11-23T01:20:14.8780445Z TORCH_CUDA_ARCH_LIST=Maxwell 2022-11-23T01:20:14.8780778Z CUDA_VERSION=11.6.2 2022-11-23T01:20:14.8781108Z NV_LIBCUBLAS_PACKAGE_NAME=libcublas-11-6 2022-11-23T01:20:14.8781412Z OPENSSL_ROOT_DIR=/opt/openssl 2022-11-23T01:20:14.8781955Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_ce135d09-f910-49d5-b241-54a523bb5e1c 2022-11-23T01:20:14.8782333Z GITHUB_JOB=test 2022-11-23T01:20:14.8782590Z SCCACHE_S3_KEY_PREFIX=pull 2022-11-23T01:20:14.8782852Z COMMIT_MESSAGES= 2022-11-23T01:20:14.8783129Z NVIDIA_DRIVER_CAPABILITIES=compute,utility 2022-11-23T01:20:14.8783417Z NUM_TEST_SHARDS=3 2022-11-23T01:20:14.8783654Z PR_NUMBER= 2022-11-23T01:20:14.8784178Z 
GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_ce135d09-f910-49d5-b241-54a523bb5e1c 2022-11-23T01:20:14.8784544Z SHLVL=1 2022-11-23T01:20:14.8784886Z NV_LIBCUBLAS_DEV_PACKAGE_NAME=libcublas-dev-11-6 2022-11-23T01:20:14.8785216Z GITHUB_REPOSITORY=pytorch/pytorch 2022-11-23T01:20:14.8786018Z NVIDIA_REQUIRE_CUDA=cuda>=11.6 brand=tesla,driver>=418,driver<419 brand=tesla,driver>=450,driver<451 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 2022-11-23T01:20:14.8786903Z NV_LIBNPP_DEV_VERSION=11.6.3.124-1 2022-11-23T01:20:14.8787212Z SHA1=1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T01:20:14.8787500Z GITHUB_EVENT_NAME=push 2022-11-23T01:20:14.8787788Z NV_CUDA_CUDART_VERSION=11.6.55-1 2022-11-23T01:20:14.8788141Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2022-11-23T01:20:14.8788435Z GITHUB_RUN_NUMBER=67324 2022-11-23T01:20:14.8788698Z GITHUB_WORKFLOW=pull 2022-11-23T01:20:14.8789466Z PATH=/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-11-23T01:20:14.8789954Z NV_LIBNCCL_DEV_PACKAGE_VERSION=2.12.10-1 2022-11-23T01:20:14.8790399Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T01:20:14.8790753Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2022-11-23T01:20:14.8791032Z _=/usr/bin/env 2022-11-23T01:20:14.8791325Z + echo 'Testing pytorch' 2022-11-23T01:20:14.8791566Z Testing pytorch 2022-11-23T01:20:14.8791840Z + export LANG=C.UTF-8 2022-11-23T01:20:14.8792105Z + LANG=C.UTF-8 2022-11-23T01:20:14.8792319Z + PR_NUMBER= 2022-11-23T01:20:14.8792578Z + [[ distributed == \d\e\f\a\u\l\t ]] 
2022-11-23T01:20:14.8792881Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]]
2022-11-23T01:20:14.8793271Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]]
2022-11-23T01:20:14.8793594Z + [[ distributed == \s\l\o\w ]]
2022-11-23T01:20:14.8794008Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *slow-gradcheck* ]]
2022-11-23T01:20:14.8794440Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda* ]]
2022-11-23T01:20:14.8794795Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda
2022-11-23T01:20:14.8795120Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda
2022-11-23T01:20:14.8795418Z + [[ distributed == *crossref* ]]
2022-11-23T01:20:14.8795697Z + [[ distributed == *dynamo* ]]
2022-11-23T01:20:14.8795975Z + [[ distributed == *inductor* ]]
2022-11-23T01:20:14.8796471Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *rocm* ]]
2022-11-23T01:20:14.8796921Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *-bazel-* ]]
2022-11-23T01:20:14.8797303Z + pip_install --user ninja==1.10.2
2022-11-23T01:20:14.8797697Z + pip install --progress-bar off --user ninja==1.10.2
2022-11-23T01:20:15.4224021Z Collecting ninja==1.10.2
2022-11-23T01:20:15.4431131Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB)
2022-11-23T01:20:16.3107758Z Installing collected packages: ninja
2022-11-23T01:20:16.3209843Z  WARNING: The script ninja is installed in '/var/lib/jenkins/.local/bin' which is not on PATH.
2022-11-23T01:20:16.3210530Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
2022-11-23T01:20:16.3274465Z Successfully installed ninja-1.10.2 2022-11-23T01:20:16.3986235Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-11-23T01:20:16.3986900Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2022-11-23T01:20:16.3987881Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *asan* ]] 2022-11-23T01:20:16.3988330Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *-tsan* ]] 2022-11-23T01:20:16.3988692Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2022-11-23T01:20:16.3989464Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2022-11-23T01:20:16.3996157Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *tbb* ]] 2022-11-23T01:20:16.4011218Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *libtorch* ]] 2022-11-23T01:20:16.4011677Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *-bazel-* ]] 2022-11-23T01:20:16.4012119Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *-tsan* ]] 2022-11-23T01:20:16.4014580Z + cd test 2022-11-23T01:20:16.4014970Z + python -c 'import torch; print(torch.__config__.show())' 2022-11-23T01:20:18.0586197Z PyTorch built with: 2022-11-23T01:20:18.0586692Z - GCC 7.5 2022-11-23T01:20:18.0587012Z - C++ Version: 201402 2022-11-23T01:20:18.0587561Z - Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-11-23T01:20:18.0588112Z - Intel(R) MKL-DNN v2.7.0 (Git Hash 650085b2f3643aad05c629425983491d63b5c289) 2022-11-23T01:20:18.0588516Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2022-11-23T01:20:18.0588895Z - LAPACK is enabled (usually provided by MKL) 2022-11-23T01:20:18.0589557Z - NNPACK is enabled 2022-11-23T01:20:18.0589880Z - CPU capability usage: AVX2 2022-11-23T01:20:18.0590191Z - CUDA Runtime 11.6 2022-11-23T01:20:18.0590586Z - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52 2022-11-23T01:20:18.0590994Z - CuDNN 8.3.2 (built against CUDA 11.5) 2022-11-23T01:20:18.0591297Z - Magma 2.6.1 2022-11-23T01:20:18.0594636Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Werror -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=1.14.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 2022-11-23T01:20:18.0596893Z 2022-11-23T01:20:18.3066524Z + cd test 2022-11-23T01:20:18.3067115Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2022-11-23T01:20:19.8957998Z ATen/Parallel: 2022-11-23T01:20:19.8958360Z 
at::get_num_threads() : 16 2022-11-23T01:20:19.8958642Z at::get_num_interop_threads() : 16 2022-11-23T01:20:19.8958942Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2022-11-23T01:20:19.8959240Z omp_get_max_threads() : 16 2022-11-23T01:20:19.8959892Z Intel(R) oneAPI Math Kernel Library Version 2022.0-Product Build 20211112 for Intel(R) 64 architecture applications 2022-11-23T01:20:19.8960604Z mkl_get_max_threads() : 16 2022-11-23T01:20:19.8961049Z Intel(R) MKL-DNN v2.7.0 (Git Hash 650085b2f3643aad05c629425983491d63b5c289) 2022-11-23T01:20:19.8961412Z std::thread::hardware_concurrency() : 32 2022-11-23T01:20:19.8961704Z Environment variables: 2022-11-23T01:20:19.8961979Z OMP_NUM_THREADS : [not set] 2022-11-23T01:20:19.8962249Z MKL_NUM_THREADS : [not set] 2022-11-23T01:20:19.8962510Z ATen parallel backend: OpenMP 2022-11-23T01:20:19.8962693Z 2022-11-23T01:20:20.1336131Z + [[ distributed == *backward* ]] 2022-11-23T01:20:20.1336469Z + [[ distributed == *xla* ]] 2022-11-23T01:20:20.1336754Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2022-11-23T01:20:20.1337338Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *libtorch* ]] 2022-11-23T01:20:20.1337686Z + [[ distributed == distributed ]] 2022-11-23T01:20:20.1337942Z + install_filelock 2022-11-23T01:20:20.1338196Z + pip_install filelock 2022-11-23T01:20:20.1338572Z + pip install --progress-bar off filelock 2022-11-23T01:20:20.6531663Z Collecting filelock 2022-11-23T01:20:20.6721609Z Downloading filelock-3.8.0-py3-none-any.whl (10 kB) 2022-11-23T01:20:21.5842209Z Installing collected packages: filelock 2022-11-23T01:20:21.6241569Z Successfully installed filelock-3.8.0 2022-11-23T01:20:21.6908469Z + install_triton 2022-11-23T01:20:21.6908837Z + local commit 2022-11-23T01:20:21.6909586Z + [[ distributed == *rocm* ]] 2022-11-23T01:20:21.6913657Z ++ get_pinned_commit triton 2022-11-23T01:20:21.6913968Z ++ cat .github/ci_commit_pins/triton.txt 2022-11-23T01:20:21.6929101Z + commit=0d7e7532279e45672555e344646f5c19c3972331 
2022-11-23T01:20:21.6930023Z + pip_install --user git+https://github.com/openai/triton@0d7e7532279e45672555e344646f5c19c3972331#subdirectory=python 2022-11-23T01:20:21.6930757Z + pip install --progress-bar off --user git+https://github.com/openai/triton@0d7e7532279e45672555e344646f5c19c3972331#subdirectory=python 2022-11-23T01:20:22.1793408Z Collecting git+https://github.com/openai/triton@0d7e7532279e45672555e344646f5c19c3972331#subdirectory=python 2022-11-23T01:20:22.1799434Z Cloning https://github.com/openai/triton (to revision 0d7e7532279e45672555e344646f5c19c3972331) to /tmp/pip-req-build-agp4kt9w 2022-11-23T01:20:22.1820224Z Running command git clone --filter=blob:none --quiet https://github.com/openai/triton /tmp/pip-req-build-agp4kt9w 2022-11-23T01:20:22.9407078Z Running command git rev-parse -q --verify 'sha^0d7e7532279e45672555e344646f5c19c3972331' 2022-11-23T01:20:22.9427168Z Running command git fetch -q https://github.com/openai/triton 0d7e7532279e45672555e344646f5c19c3972331 2022-11-23T01:20:23.3499313Z Running command git checkout -q 0d7e7532279e45672555e344646f5c19c3972331 2022-11-23T01:20:23.5478323Z Resolved https://github.com/openai/triton to commit 0d7e7532279e45672555e344646f5c19c3972331 2022-11-23T01:20:23.5479420Z Running command git submodule update --init --recursive -q 2022-11-23T01:20:24.1692953Z Preparing metadata (setup.py) ... 
done 2022-11-23T01:20:24.3882981Z Collecting cmake 2022-11-23T01:20:24.4111061Z Downloading cmake-3.25.0-py2.py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23.7 MB) 2022-11-23T01:20:24.8220379Z Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from triton==2.0.0) (3.8.0) 2022-11-23T01:20:24.8222811Z Requirement already satisfied: torch in /opt/conda/lib/python3.10/site-packages (from triton==2.0.0) (1.14.0a0+git1cfd385) 2022-11-23T01:20:24.8479952Z Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.10/site-packages (from torch->triton==2.0.0) (4.4.0) 2022-11-23T01:20:24.8484984Z Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch->triton==2.0.0) (1.11.1) 2022-11-23T01:20:24.8490034Z Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from torch->triton==2.0.0) (2.6.3) 2022-11-23T01:20:24.8703215Z Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch->triton==2.0.0) (1.2.1) 2022-11-23T01:20:24.8776013Z Building wheels for collected packages: triton 2022-11-23T01:21:17.4521573Z Building wheel for triton (setup.py) ... 
done 2022-11-23T01:21:17.5005887Z Created wheel for triton: filename=triton-2.0.0-cp310-cp310-linux_x86_64.whl size=15377935 sha256=29ec2c3f3d92aff6e885c562f8b78a9dd4a238dd0d859c26e76f0b70b72ab85b 2022-11-23T01:21:17.5008346Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/3f/1d/23/1c2bc47d618a44f9c949aea4b7e355e737a1f1ed208f009295 2022-11-23T01:21:17.5027084Z Successfully built triton 2022-11-23T01:21:18.3630543Z Installing collected packages: cmake, triton 2022-11-23T01:21:20.1774464Z Successfully installed cmake-3.25.0 triton-2.0.0 2022-11-23T01:21:20.2868612Z + pip_install --user jinja2 2022-11-23T01:21:20.2869430Z + pip install --progress-bar off --user jinja2 2022-11-23T01:21:21.3143747Z Collecting jinja2 2022-11-23T01:21:21.3355517Z Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB) 2022-11-23T01:21:21.6031066Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2) (2.1.1) 2022-11-23T01:21:22.5171097Z Installing collected packages: jinja2 2022-11-23T01:21:22.6253256Z Successfully installed jinja2-3.1.2 2022-11-23T01:21:22.6979117Z + test_distributed 2022-11-23T01:21:22.6979619Z + echo 'Testing distributed python tests' 2022-11-23T01:21:22.6979919Z Testing distributed python tests 2022-11-23T01:21:22.6980375Z + python test/run_test.py --distributed-tests --shard 3 3 --verbose 2022-11-23T01:21:24.9359074Z Ignoring disabled issues: [] 2022-11-23T01:21:24.9762199Z /var/lib/jenkins/workspace/test/run_test.py:1134: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. 
2022-11-23T01:21:24.9762799Z if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6": 2022-11-23T01:21:24.9766712Z Found test time stats from artifacts 2022-11-23T01:21:24.9782365Z Selected tests: 2022-11-23T01:21:24.9782704Z distributed/algorithms/quantization/test_quantization 2022-11-23T01:21:24.9783055Z distributed/test_distributed_spawn 2022-11-23T01:21:24.9783390Z distributed/pipeline/sync/test_worker 2022-11-23T01:21:24.9783710Z distributed/pipeline/sync/test_pipeline 2022-11-23T01:21:24.9784041Z distributed/pipeline/sync/test_microbatch 2022-11-23T01:21:24.9784394Z distributed/pipeline/sync/test_deferred_batch_norm 2022-11-23T01:21:24.9787354Z distributed/pipeline/sync/test_bugs 2022-11-23T01:21:24.9788002Z distributed/pipeline/sync/skip/test_tracker 2022-11-23T01:21:24.9788356Z distributed/pipeline/sync/skip/test_leak 2022-11-23T01:21:24.9788688Z distributed/pipeline/sync/skip/test_api 2022-11-23T01:21:24.9789316Z distributed/elastic/timer/api_test 2022-11-23T01:21:24.9789667Z distributed/checkpoint/test_dedup_tensors 2022-11-23T01:21:24.9790020Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T01:21:24.9790362Z distributed/_composable/test_checkpoint 2022-11-23T01:21:24.9793702Z distributed/checkpoint/test_utils 2022-11-23T01:21:24.9794019Z distributed/fsdp/test_utils 2022-11-23T01:21:24.9794380Z distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T01:21:24.9794714Z distributed/test_data_parallel 2022-11-23T01:21:24.9795213Z distributed/elastic/utils/distributed_test 2022-11-23T01:21:24.9795556Z distributed/fsdp/test_fsdp_uneven 2022-11-23T01:21:24.9795870Z distributed/fsdp/test_fsdp_pure_fp16 2022-11-23T01:21:24.9796192Z distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T01:21:24.9796546Z distributed/_shard/sharded_tensor/ops/test_chunk 2022-11-23T01:21:24.9796870Z distributed/test_c10d_error_logger 2022-11-23T01:21:24.9797192Z distributed/_shard/sharding_spec/test_sharding_spec 
2022-11-23T01:21:24.9797523Z distributed/fsdp/test_fsdp_input 2022-11-23T01:21:24.9797870Z distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-11-23T01:21:24.9798204Z distributed/_shard/test_partial_tensor 2022-11-23T01:21:24.9798629Z distributed/_tensor/test_math_ops 2022-11-23T01:21:24.9798977Z distributed/_tensor/parallel/test_tp_examples 2022-11-23T01:21:24.9799302Z distributed/fsdp/test_fsdp_memory 2022-11-23T01:21:24.9799595Z distributed/_tensor/test_pointwise_ops 2022-11-23T01:21:24.9799926Z distributed/fsdp/test_fsdp_tp_integration 2022-11-23T01:21:24.9800257Z distributed/fsdp/test_fsdp_clip_grad_norm 2022-11-23T01:21:24.9800589Z distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T01:21:24.9800915Z distributed/test_c10d_spawn_ucc 2022-11-23T01:21:24.9801214Z distributed/_tensor/test_matrix_ops 2022-11-23T01:21:24.9801521Z distributed/fsdp/test_fsdp_flatten_params 2022-11-23T01:21:24.9801828Z distributed/test_c10d_spawn_gloo 2022-11-23T01:21:24.9802121Z distributed/test_c10d_spawn_nccl 2022-11-23T01:21:24.9802402Z distributed/_tensor/test_device_mesh 2022-11-23T01:21:24.9802701Z distributed/test_pg_wrapper 2022-11-23T01:21:24.9802994Z distributed/fsdp/test_fsdp_comm_hooks 2022-11-23T01:21:24.9803338Z distributed/optim/test_zero_redundancy_optimizer 2022-11-23T01:21:24.9803658Z distributed/fsdp/test_fsdp_optim_state 2022-11-23T01:21:24.9803956Z distributed/test_c10d_gloo 2022-11-23T01:21:24.9804242Z distributed/fsdp/test_fsdp_core 2022-11-23T01:21:24.9808386Z Prioritized test from test file changes. 
2022-11-23T01:21:24.9808709Z reordering tests for PR: 2022-11-23T01:21:24.9808974Z prioritized: [] 2022-11-23T01:21:24.9813410Z the rest: ['distributed/algorithms/quantization/test_quantization', 'distributed/test_distributed_spawn', 'distributed/pipeline/sync/test_worker', 'distributed/pipeline/sync/test_pipeline', 'distributed/pipeline/sync/test_microbatch', 'distributed/pipeline/sync/test_deferred_batch_norm', 'distributed/pipeline/sync/test_bugs', 'distributed/pipeline/sync/skip/test_tracker', 'distributed/pipeline/sync/skip/test_leak', 'distributed/pipeline/sync/skip/test_api', 'distributed/elastic/timer/api_test', 'distributed/checkpoint/test_dedup_tensors', 'distributed/_shard/sharded_tensor/ops/test_math_ops', 'distributed/_composable/test_checkpoint', 'distributed/checkpoint/test_utils', 'distributed/fsdp/test_utils', 'distributed/_shard/sharded_optim/test_sharded_optim', 'distributed/test_data_parallel', 'distributed/elastic/utils/distributed_test', 'distributed/fsdp/test_fsdp_uneven', 'distributed/fsdp/test_fsdp_pure_fp16', 'distributed/_shard/sharded_tensor/ops/test_softmax', 'distributed/_shard/sharded_tensor/ops/test_chunk', 'distributed/test_c10d_error_logger', 'distributed/_shard/sharding_spec/test_sharding_spec', 'distributed/fsdp/test_fsdp_input', 'distributed/_shard/sharded_tensor/ops/test_elementwise_ops', 'distributed/_shard/test_partial_tensor', 'distributed/_tensor/test_math_ops', 'distributed/_tensor/parallel/test_tp_examples', 'distributed/fsdp/test_fsdp_memory', 'distributed/_tensor/test_pointwise_ops', 'distributed/fsdp/test_fsdp_tp_integration', 'distributed/fsdp/test_fsdp_clip_grad_norm', 'distributed/_shard/sharded_tensor/ops/test_matrix_ops', 'distributed/test_c10d_spawn_ucc', 'distributed/_tensor/test_matrix_ops', 'distributed/fsdp/test_fsdp_flatten_params', 'distributed/test_c10d_spawn_gloo', 'distributed/test_c10d_spawn_nccl', 'distributed/_tensor/test_device_mesh', 'distributed/test_pg_wrapper', 
'distributed/fsdp/test_fsdp_comm_hooks', 'distributed/optim/test_zero_redundancy_optimizer', 'distributed/fsdp/test_fsdp_optim_state', 'distributed/test_c10d_gloo', 'distributed/fsdp/test_fsdp_core'] 2022-11-23T01:21:24.9816405Z 2022-11-23T01:21:24.9816969Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/slow-tests.json to /var/lib/jenkins/workspace/test/.pytorch-slow-tests.json 2022-11-23T01:21:25.0034093Z Downloading https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2022-11-23T01:21:25.0210793Z parallel (file granularity) tests: 2022-11-23T01:21:25.0211057Z 2022-11-23T01:21:25.0211456Z serial (file granularity) tests: 2022-11-23T01:21:25.0211818Z distributed/algorithms/quantization/test_quantization 2022-11-23T01:21:25.0212146Z distributed/test_distributed_spawn 2022-11-23T01:21:25.0212464Z distributed/pipeline/sync/test_worker 2022-11-23T01:21:25.0212792Z distributed/pipeline/sync/test_pipeline 2022-11-23T01:21:25.0213132Z distributed/pipeline/sync/test_microbatch 2022-11-23T01:21:25.0213468Z distributed/pipeline/sync/test_deferred_batch_norm 2022-11-23T01:21:25.0213808Z distributed/pipeline/sync/test_bugs 2022-11-23T01:21:25.0214139Z distributed/pipeline/sync/skip/test_tracker 2022-11-23T01:21:25.0214454Z distributed/pipeline/sync/skip/test_leak 2022-11-23T01:21:25.0214784Z distributed/pipeline/sync/skip/test_api 2022-11-23T01:21:25.0215119Z distributed/elastic/timer/api_test 2022-11-23T01:21:25.0215424Z distributed/checkpoint/test_dedup_tensors 2022-11-23T01:21:25.0215773Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T01:21:25.0216121Z distributed/_composable/test_checkpoint 2022-11-23T01:21:25.0216444Z distributed/checkpoint/test_utils 2022-11-23T01:21:25.0216721Z distributed/fsdp/test_utils 2022-11-23T01:21:25.0217049Z distributed/_shard/sharded_optim/test_sharded_optim 
2022-11-23T01:21:25.0217374Z distributed/test_data_parallel 2022-11-23T01:21:25.0217679Z distributed/elastic/utils/distributed_test 2022-11-23T01:21:25.0217998Z distributed/fsdp/test_fsdp_uneven 2022-11-23T01:21:25.0218306Z distributed/fsdp/test_fsdp_pure_fp16 2022-11-23T01:21:25.0218627Z distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T01:21:25.0218985Z distributed/_shard/sharded_tensor/ops/test_chunk 2022-11-23T01:21:25.0219311Z distributed/test_c10d_error_logger 2022-11-23T01:21:25.0219632Z distributed/_shard/sharding_spec/test_sharding_spec 2022-11-23T01:21:25.0219963Z distributed/fsdp/test_fsdp_input 2022-11-23T01:21:25.0220311Z distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-11-23T01:21:25.0220643Z distributed/_shard/test_partial_tensor 2022-11-23T01:21:25.0220956Z distributed/_tensor/test_math_ops 2022-11-23T01:21:25.0221283Z distributed/_tensor/parallel/test_tp_examples 2022-11-23T01:21:25.0221650Z distributed/fsdp/test_fsdp_memory 2022-11-23T01:21:25.0221955Z distributed/_tensor/test_pointwise_ops 2022-11-23T01:21:25.0222285Z distributed/fsdp/test_fsdp_tp_integration 2022-11-23T01:21:25.0222614Z distributed/fsdp/test_fsdp_clip_grad_norm 2022-11-23T01:21:25.0222944Z distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T01:21:25.0223274Z distributed/test_c10d_spawn_ucc 2022-11-23T01:21:25.0223574Z distributed/_tensor/test_matrix_ops 2022-11-23T01:21:25.0223877Z distributed/fsdp/test_fsdp_flatten_params 2022-11-23T01:21:25.0224191Z distributed/test_c10d_spawn_gloo 2022-11-23T01:21:25.0224483Z distributed/test_c10d_spawn_nccl 2022-11-23T01:21:25.0224764Z distributed/_tensor/test_device_mesh 2022-11-23T01:21:25.0225056Z distributed/test_pg_wrapper 2022-11-23T01:21:25.0225350Z distributed/fsdp/test_fsdp_comm_hooks 2022-11-23T01:21:25.0270760Z distributed/optim/test_zero_redundancy_optimizer 2022-11-23T01:21:25.0271182Z distributed/fsdp/test_fsdp_optim_state 2022-11-23T01:21:25.0271711Z distributed/test_c10d_gloo 
2022-11-23T01:21:25.0272229Z distributed/fsdp/test_fsdp_core 2022-11-23T01:21:27.2432515Z Ignoring disabled issues: [] 2022-11-23T01:21:27.2672259Z Ignoring disabled issues: [] 2022-11-23T01:21:27.6908719Z Running distributed/algorithms/quantization/test_quantization ... [2022-11-23 01:21:27.690383] 2022-11-23T01:21:27.6917302Z /usr/bin/mpiexec 2022-11-23T01:21:27.6918439Z MPI not available -- MPI backend tests will be skipped 2022-11-23T01:21:27.6919656Z Map different backends to different shards for distributed/algorithms/quantization/test_quantization: {'gloo': 1, 'nccl': 2} 2022-11-23T01:21:27.6920461Z Shard 3: test should be run in 1 2022-11-23T01:21:27.6920995Z Shard 3: nccl should be run in 2 2022-11-23T01:21:27.6921488Z Shard 3: gloo should be run in 1 2022-11-23T01:21:27.6922025Z Shard 3: ucc should be run in 1 2022-11-23T01:21:27.6923843Z Running distributed/test_distributed_spawn ... [2022-11-23 01:21:27.692096] 2022-11-23T01:21:27.6931857Z /usr/bin/mpiexec 2022-11-23T01:21:27.6933065Z MPI not available -- MPI backend tests will be skipped 2022-11-23T01:21:27.6934132Z Map different backends to different shards for distributed/test_distributed_spawn: {'gloo': 1, 'nccl': 2, 'ucc': 3} 2022-11-23T01:21:27.6934855Z Shard 3: test should be run in 1 2022-11-23T01:21:27.6935349Z Shard 3: nccl should be run in 2 2022-11-23T01:21:27.6935849Z Shard 3: gloo should be run in 1 2022-11-23T01:21:27.6940074Z Running distributed tests for the ucc backend with env init_method in shard 3 of 3 2022-11-23T01:21:27.6945958Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 01:21:27.694283] 2022-11-23T01:46:55.9135585Z 2022-11-23T01:46:55.9138002Z Expand the folded group to see the log file of distributed/test_distributed_spawn 2022-11-23T01:46:55.9144731Z ##[group]PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_ceaarsv3) 2022-11-23T01:46:55.9149598Z 2022-11-23T01:46:55.9194460Z , <__main__.TestDistBackendWithSpawn testMethod=test_3_level_hierarchical_model_averager>, <__main__.TestDistBackendWithSpawn testMethod=test_Backend_enum_class>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_2D_Input>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Channels_Last>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_No_Affine>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_non_default_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_with_amp_and_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedSampler_padding>, <__main__.TestDistBackendWithSpawn 
testMethod=test_SyncBatchNorm_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_with_then_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_simple>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_with_empty>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_cat_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_stack_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_default_pg>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_max>, <__main__.TestDistBackendWithSpawn 
testMethod=test_all_reduce_coalesced_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max_complex_unsupported>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_complex_unsupported_ops>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu_complex>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_result_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_average_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_global>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_group>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo_tags>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_mixed_backend_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_no_rank_zero_nccl>, <__main__.TestDistBackendWithSpawn 
testMethod=test_batch_isend_irecv_op_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_op_list_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_ring_exchange_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_self_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_tensor_err>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_with_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_without_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer_via_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce_return_future>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_comm_hook_logging>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_different_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_same_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_create_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_device>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_forward_backward_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_grad_div_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_post_localSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_pickling_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_ignore_params_arg>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_inference>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_join_model_equivalence>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_gpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_num_params_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_shape_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_err_ignore_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_error>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_namedtuple>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_python_error_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_returns_tensor_with_no_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_shared_grad_acc_unused_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_static_graph_nested_types>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_bn_training_vs_eval>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_module_states>, 
<__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_join_disable>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs_stop_iteration_sync_bn>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_unused_params_rebuild_buckets_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_zero_output_features>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_group>, <__main__.TestDistBackendWithSpawn testMethod=test_detect_ddp_is_actually_static>, <__main__.TestDistBackendWithSpawn testMethod=test_different_graph_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_dump_DDP_relevant_env_vars>, <__main__.TestDistBackendWithSpawn testMethod=test_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_checks>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_get_backend>, <__main__.TestDistBackendWithSpawn testMethod=test_get_future>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_group>, <__main__.TestDistBackendWithSpawn testMethod=test_invalid_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_irecv>, <__main__.TestDistBackendWithSpawn testMethod=test_isend>, <__main__.TestDistBackendWithSpawn testMethod=test_isend_autograd_profiler>, 
<__main__.TestDistBackendWithSpawn testMethod=test_isend_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_failure_order>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_rank_0_timeout>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allgather>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_reduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_high_priority_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_input_rank_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_negative_input_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_group_size_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_overlap_not_allowed>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_world_size_not_divisible_by_group_size>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_dict_module>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_tuple_module>, <__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager>, 
<__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager_param_group>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_step_reload>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_checks>, 
<__main__.TestDistBackendWithSpawn testMethod=test_scatter_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_stateless_api_with_ddp>, <__main__.TestDistBackendWithSpawn testMethod=test_static_graph_api_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_sync_bn_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_undefined_grad_parity_unused_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_verify_model_across_rank_with_logger>, <__main__.TestDistBackendWithSpawn 
testMethod=test_verify_model_across_rank_without_logger>]> 2022-11-23T01:46:55.9232297Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9232831Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9233273Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9233705Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9234146Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9234635Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9235134Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9235621Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9236150Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9236716Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9237296Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9237822Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9238372Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9238909Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9239386Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9239892Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9240377Z test_DistributedSampler_padding 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9240823Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9241240Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9241712Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9242207Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9242672Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9243096Z test_all_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9243505Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9243944Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9244358Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9244793Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9245229Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9245633Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9246179Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9246599Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9247016Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9247396Z test_all_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9247812Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9248250Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9248655Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9249075Z test_all_gather_multigpu_complex 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9249596Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9250010Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9250418Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9250847Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9251300Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9251738Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9252203Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9252644Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9253061Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9253507Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9253978Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9254638Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9255081Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9255538Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9255973Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9256374Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9256810Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9257247Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9257664Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9258071Z 
test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9258501Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9258916Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9259300Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9259715Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9260128Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9260519Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9260883Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9261281Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9261700Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9262099Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9262507Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9262901Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9263276Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9263683Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9264157Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9264570Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9264976Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9265374Z test_all_to_all (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9265756Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9266123Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9266517Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) 
2022-11-23T01:46:55.9266923Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9267314Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9267780Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9268177Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9268587Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9269567Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9270061Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9270519Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9270968Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9271434Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9271896Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9272347Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9272778Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9273226Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9273677Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9274140Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9274590Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9275062Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9275527Z test_all_to_all_single_unequal_split_group 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9275967Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9276404Z test_average_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9276811Z test_backend_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9277180Z test_backend_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9277553Z test_barrier (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9277930Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9278320Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9278701Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9279099Z test_barrier_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9279488Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9279873Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9280291Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9280696Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9281084Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9281502Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9281937Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9282443Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9282872Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9283296Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9283718Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9284145Z test_batch_isend_irecv_ring_exchange_nccl 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9284582Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9285004Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9285397Z test_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9285838Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9286240Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9286640Z test_broadcast_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9287024Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9287430Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9287896Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9288424Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9288870Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9289290Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9289714Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9290146Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9290608Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9291081Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9291533Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9291952Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9292405Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9292822Z 
test_ddp_create_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9293185Z test_ddp_device (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9293590Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9294010Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9294419Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9294866Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9295313Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9295742Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9296145Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9296623Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9297137Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9297717Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9298328Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9298956Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9299619Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9300249Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9300836Z 
test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9301443Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9302045Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9302654Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9303149Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9303605Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9304002Z test_ddp_inference (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9304410Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9304813Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9305214Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9305648Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9306122Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9306816Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9307321Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9307748Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9308129Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9308554Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9309542Z test_ddp_profiling_autograd_profiler 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9310025Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9310455Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9310881Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9311331Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9311773Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9312207Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9312624Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9313024Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9313465Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9313880Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9314291Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9314773Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9315219Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9315626Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9315997Z test_destroy_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9316411Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9316848Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9317250Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9317727Z test_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9318115Z test_gather_checks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9318502Z 
test_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9318870Z test_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9319253Z test_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9319633Z test_gather_object (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9320012Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9320399Z test_get_backend (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9320766Z test_get_future (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9321181Z test_get_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9321562Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9321959Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9322340Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9322711Z test_irecv (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9323060Z test_isend (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9323443Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9323834Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9324250Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9324710Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9325149Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9325571Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9325996Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9326437Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9326856Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) 
2022-11-23T01:46:55.9327283Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9327705Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9328105Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9328520Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9328935Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9329317Z test_new_subgroups (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9329717Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9330192Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9330686Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9331144Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9331598Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9332063Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9332509Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9332941Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9333366Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9333798Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9334226Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9334678Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9335211Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9335719Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9336208Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9336631Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9337029Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9337418Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9337824Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9338213Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9338634Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9339022Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9339412Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9339770Z test_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9340137Z test_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9340513Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9340895Z test_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9341278Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9341685Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9342063Z test_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9342420Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9342811Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9343198Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9343550Z test_scatter (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9343955Z test_scatter_checks 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9344338Z test_scatter_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9344717Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9345086Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9345485Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9345869Z test_scatter_group (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9346238Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9346612Z test_send_recv (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9346987Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9347399Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9347850Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9348281Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9348681Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9349693Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9350147Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9350568Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9350952Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9351373Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9351815Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9352228Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9352624Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9353033Z test_stateless_api_with_ddp 
(__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9353435Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9353883Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9354321Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9354778Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9355208Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9355951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9356414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9356999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9357537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9357771Z 2022-11-23T01:46:55.9357881Z Running tests... 2022-11-23T01:46:55.9358299Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9358819Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9359415Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9359987Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 909 2022-11-23T01:46:55.9360436Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 910 2022-11-23T01:46:55.9361031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9361495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9362080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9362560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9363133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9363587Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9364169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9364627Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9365083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9365584Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9366258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9366945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9367476Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9367957Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9368485Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9369313Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:46:55.9369999Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9370839Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:46:55.9371596Z [1669166498.392985] [08317a7e7676:909 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9372155Z [1669166498.396176] [08317a7e7676:910 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9372646Z [1669166498.406010] [08317a7e7676:909 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9373124Z [1669166498.406010] [08317a7e7676:909 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9373653Z [1669166498.408088] [08317a7e7676:910 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9374121Z [1669166498.408088] [08317a7e7676:910 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9374630Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9375478Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:46:55.9376154Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9376984Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:46:55.9377639Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9378472Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:46:55.9379141Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9379971Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 
2022-11-23T01:46:55.9380625Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9381455Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:46:55.9382121Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T01:46:55.9382948Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T01:46:55.9383424Z ok (7.483s) 2022-11-23T01:46:55.9383575Z 2022-11-23T01:46:55.9383843Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9384169Z Ran 1 test in 7.483s 2022-11-23T01:46:55.9384330Z 2022-11-23T01:46:55.9384423Z OK 2022-11-23T01:46:55.9384538Z 2022-11-23T01:46:55.9384662Z Generating XML reports... 2022-11-23T01:46:55.9385270Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012131.xml 2022-11-23T01:46:55.9386002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9386445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9387032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9387507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9387739Z 2022-11-23T01:46:55.9387898Z Running tests... 
2022-11-23T01:46:55.9388293Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9388846Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9389934Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.004s) 2022-11-23T01:46:55.9390248Z 2022-11-23T01:46:55.9390506Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9390837Z Ran 1 test in 0.004s 2022-11-23T01:46:55.9390999Z 2022-11-23T01:46:55.9391107Z OK (skipped=1) 2022-11-23T01:46:55.9391355Z 2022-11-23T01:46:55.9391480Z Generating XML reports... 2022-11-23T01:46:55.9392074Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012141.xml 2022-11-23T01:46:55.9392803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9393260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9393826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9394300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9394530Z 2022-11-23T01:46:55.9394639Z Running tests... 2022-11-23T01:46:55.9395046Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9395556Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9396075Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9396571Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1057 2022-11-23T01:46:55.9397010Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1058 2022-11-23T01:46:55.9397630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9398087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9398666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9399124Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9399707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9400158Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9400742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9401198Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9401661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9402164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9402815Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9403512Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9404043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9404522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9404857Z ok (4.357s) 2022-11-23T01:46:55.9405006Z 2022-11-23T01:46:55.9405275Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9405601Z Ran 1 test in 4.357s 2022-11-23T01:46:55.9405761Z 2022-11-23T01:46:55.9405908Z OK 2022-11-23T01:46:55.9406053Z 2022-11-23T01:46:55.9406178Z Generating XML reports... 2022-11-23T01:46:55.9406791Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012144.xml 2022-11-23T01:46:55.9407803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9408249Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9409063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9409541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9409851Z 2022-11-23T01:46:55.9409970Z Running tests... 2022-11-23T01:46:55.9410367Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9410899Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9411485Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9412540Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77317 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.604s) 2022-11-23T01:46:55.9413094Z 2022-11-23T01:46:55.9413342Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9413669Z Ran 1 test in 1.604s 2022-11-23T01:46:55.9413832Z 2022-11-23T01:46:55.9413945Z OK (skipped=1) 2022-11-23T01:46:55.9414100Z 2022-11-23T01:46:55.9414226Z Generating XML reports... 2022-11-23T01:46:55.9414814Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012151.xml 2022-11-23T01:46:55.9415541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9415997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9416558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9417032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9417264Z 2022-11-23T01:46:55.9417373Z Running tests... 2022-11-23T01:46:55.9417776Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9418287Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9418838Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9419363Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1194 2022-11-23T01:46:55.9419803Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1195 2022-11-23T01:46:55.9420416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9420869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9421450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9421911Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9422491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9422946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9423526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9424044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9424510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9425012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9425661Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9426362Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9426890Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9427460Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphvr41jxv 2022-11-23T01:46:55.9427987Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphvr41jxv/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9428505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9429489Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoa4mno08 2022-11-23T01:46:55.9430073Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoa4mno08/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9430576Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9431069Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9431601Z [1669166519.226306] [08317a7e7676:1194 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9432102Z [1669166520.653037] [08317a7e7676:1194 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9432585Z [1669166520.653037] [08317a7e7676:1194 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9433097Z [1669166519.226333] [08317a7e7676:1195 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9433599Z [1669166520.682295] [08317a7e7676:1195 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9434051Z [1669166520.682295] [08317a7e7676:1195 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9434396Z ok (6.213s) 2022-11-23T01:46:55.9434549Z 2022-11-23T01:46:55.9434835Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9435168Z Ran 1 test in 6.213s 2022-11-23T01:46:55.9435314Z 2022-11-23T01:46:55.9435408Z OK 2022-11-23T01:46:55.9435540Z 2022-11-23T01:46:55.9435664Z Generating XML reports... 2022-11-23T01:46:55.9436276Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012155.xml 2022-11-23T01:46:55.9436986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9437446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9438024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9438499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9438731Z 2022-11-23T01:46:55.9438822Z Running tests... 2022-11-23T01:46:55.9439228Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9439762Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9440324Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9440932Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1308 2022-11-23T01:46:55.9441396Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1309 2022-11-23T01:46:55.9442015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9442455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9443034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9443510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9444089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9444593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9445179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9445648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9446089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9446592Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9447260Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9447957Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9448477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9449055Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqzkur0e0 2022-11-23T01:46:55.9449607Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqzkur0e0/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9450124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9450612Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsfap2ch6 2022-11-23T01:46:55.9451161Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsfap2ch6/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9451683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9452157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9452684Z [1669166528.063793] [08317a7e7676:1308 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9453199Z [1669166529.478826] [08317a7e7676:1308 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9453681Z [1669166529.478826] [08317a7e7676:1308 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9454177Z [1669166528.064187] [08317a7e7676:1309 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9454678Z [1669166529.501932] [08317a7e7676:1309 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9455143Z [1669166529.501932] [08317a7e7676:1309 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9455481Z ok (6.253s) 2022-11-23T01:46:55.9455634Z 2022-11-23T01:46:55.9455893Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9456224Z Ran 1 test in 6.253s 2022-11-23T01:46:55.9456386Z 2022-11-23T01:46:55.9456479Z OK 2022-11-23T01:46:55.9456611Z 2022-11-23T01:46:55.9456735Z Generating XML reports... 2022-11-23T01:46:55.9457381Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012204.xml 2022-11-23T01:46:55.9458114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9458574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9459137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9459641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9459876Z 2022-11-23T01:46:55.9459984Z Running tests... 2022-11-23T01:46:55.9460445Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9460960Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9461522Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9462056Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1422 2022-11-23T01:46:55.9462485Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1423 2022-11-23T01:46:55.9463094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9463544Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9464128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9464588Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9465174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9465623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9466190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9466666Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9467125Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9467626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9468276Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9469355Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9469977Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9470461Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9470956Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9f747i71 2022-11-23T01:46:55.9471498Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9f747i71/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9472035Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvun5v04x 2022-11-23T01:46:55.9472577Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvun5v04x/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9473116Z [1669166538.284867] [08317a7e7676:1423 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9473629Z [1669166538.298351] [08317a7e7676:1423 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9474103Z [1669166538.298351] [08317a7e7676:1423 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9474697Z [1669166538.280564] [08317a7e7676:1422 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9475196Z [1669166538.294442] [08317a7e7676:1422 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9475664Z [1669166538.294442] [08317a7e7676:1422 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9476145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9476624Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:55.9477178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9477669Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9478154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9478616Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9479096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9479575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9480052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9480515Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9480991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9481473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9481805Z ok (7.368s) 2022-11-23T01:46:55.9481954Z 2022-11-23T01:46:55.9482244Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9482577Z Ran 1 test in 7.368s 2022-11-23T01:46:55.9482739Z 2022-11-23T01:46:55.9482831Z OK 2022-11-23T01:46:55.9482946Z 2022-11-23T01:46:55.9483070Z Generating XML reports... 
2022-11-23T01:46:55.9483682Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012212.xml 2022-11-23T01:46:55.9484410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9484851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9485435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9485916Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9486151Z 2022-11-23T01:46:55.9486260Z Running tests... 2022-11-23T01:46:55.9486648Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9487176Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9487742Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9488268Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1540 2022-11-23T01:46:55.9488719Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1541 2022-11-23T01:46:55.9489334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9489788Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9490358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9490832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9491481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9491923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9492501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9492970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9493427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9493912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9494630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9495335Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9495867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9496328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9496832Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5k5suw_2 2022-11-23T01:46:55.9497379Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5k5suw_2/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9497901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcidcxnqs 2022-11-23T01:46:55.9498447Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcidcxnqs/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9499007Z [1669166548.117974] [08317a7e7676:1541 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9499523Z [1669166548.130602] [08317a7e7676:1541 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9499980Z [1669166548.130602] [08317a7e7676:1541 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9500495Z [1669166548.110564] [08317a7e7676:1540 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9500995Z [1669166548.124455] [08317a7e7676:1540 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9501459Z [1669166548.124455] [08317a7e7676:1540 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9501929Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9502422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:55.9502911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9503397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9503727Z ok (6.247s) 2022-11-23T01:46:55.9503873Z 2022-11-23T01:46:55.9504147Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9504477Z Ran 1 test in 6.247s 2022-11-23T01:46:55.9504638Z 2022-11-23T01:46:55.9504712Z OK 2022-11-23T01:46:55.9504846Z 2022-11-23T01:46:55.9504971Z Generating XML reports... 2022-11-23T01:46:55.9505582Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012222.xml 2022-11-23T01:46:55.9506312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9506752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9507386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9507871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9508102Z 2022-11-23T01:46:55.9508211Z Running tests... 2022-11-23T01:46:55.9508600Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9509647Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9510239Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9510774Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1658 2022-11-23T01:46:55.9511367Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1659 2022-11-23T01:46:55.9511996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9512460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9513025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9513500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9514083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9514515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9515095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9515566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9516024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9516512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9517179Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9517879Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9518412Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9518874Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9519375Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpodxsano4 2022-11-23T01:46:55.9519924Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpodxsano4/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9520448Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7t0y1xjd 2022-11-23T01:46:55.9520993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7t0y1xjd/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9521553Z [1669166556.927295] [08317a7e7676:1658 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9522061Z [1669166556.941105] [08317a7e7676:1658 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9522520Z [1669166556.941105] [08317a7e7676:1658 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9523037Z [1669166556.930405] [08317a7e7676:1659 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9523557Z [1669166556.943823] [08317a7e7676:1659 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9524108Z [1669166556.943823] [08317a7e7676:1659 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9524589Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9525086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:55.9525572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9526059Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9526523Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9527004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9527539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9528000Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9528355Z ok (6.348s) 2022-11-23T01:46:55.9528506Z 2022-11-23T01:46:55.9528784Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9529116Z Ran 1 test in 6.349s 2022-11-23T01:46:55.9529260Z 2022-11-23T01:46:55.9529354Z OK 2022-11-23T01:46:55.9529488Z 2022-11-23T01:46:55.9529611Z Generating XML reports... 2022-11-23T01:46:55.9530217Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012231.xml 2022-11-23T01:46:55.9530924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9531381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9531969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9532444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9532658Z 2022-11-23T01:46:55.9532769Z Running tests... 
2022-11-23T01:46:55.9533174Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9533707Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9534286Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9534862Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1776 2022-11-23T01:46:55.9535318Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1777 2022-11-23T01:46:55.9535933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9536374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9536963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9537439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9538022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9538457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9539029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9539501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9539946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9540455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9541123Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9541873Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9542396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9542874Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9543378Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnsj4zeok 2022-11-23T01:46:55.9543927Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnsj4zeok/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9544453Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuukg3kfz 2022-11-23T01:46:55.9545046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuukg3kfz/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9545611Z [1669166565.765837] [08317a7e7676:1777 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9546104Z [1669166565.779086] [08317a7e7676:1777 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9546580Z [1669166565.779086] [08317a7e7676:1777 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9547091Z [1669166565.758644] [08317a7e7676:1776 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9547596Z [1669166565.772415] [08317a7e7676:1776 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9548049Z [1669166565.772415] [08317a7e7676:1776 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9548531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:55.9549427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9549888Z ok (6.353s) 2022-11-23T01:46:55.9550038Z 2022-11-23T01:46:55.9550304Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9550636Z Ran 1 test in 6.353s 2022-11-23T01:46:55.9550799Z 2022-11-23T01:46:55.9550893Z OK 2022-11-23T01:46:55.9551027Z 2022-11-23T01:46:55.9551132Z Generating XML reports... 2022-11-23T01:46:55.9551741Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012240.xml 2022-11-23T01:46:55.9552461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9552925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9553492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9553969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9554200Z 2022-11-23T01:46:55.9554309Z Running tests... 2022-11-23T01:46:55.9554713Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9555224Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9555820Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9556391Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 1894 2022-11-23T01:46:55.9556826Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 1895 2022-11-23T01:46:55.9557442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9557899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9558565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9559044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9559632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9560082Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9560642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9561116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9561646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9562152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9562804Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9563507Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9564041Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9564523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9565016Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmjpl2yct 2022-11-23T01:46:55.9565560Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmjpl2yct/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9566102Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp83tt8yqu 2022-11-23T01:46:55.9566623Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp83tt8yqu/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9567179Z [1669166574.667169] [08317a7e7676:1894 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9567690Z [1669166574.680567] [08317a7e7676:1894 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9568163Z [1669166574.680567] [08317a7e7676:1894 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9568658Z [1669166574.669490] [08317a7e7676:1895 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9569161Z [1669166574.683119] [08317a7e7676:1895 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9569623Z [1669166574.683119] [08317a7e7676:1895 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9570104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9570578Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:55.9571067Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9571554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9571903Z ok (7.054s) 2022-11-23T01:46:55.9572033Z 2022-11-23T01:46:55.9572309Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9572644Z Ran 1 test in 7.054s 2022-11-23T01:46:55.9572808Z 2022-11-23T01:46:55.9572906Z OK 2022-11-23T01:46:55.9573040Z 2022-11-23T01:46:55.9573147Z Generating XML reports... 2022-11-23T01:46:55.9573760Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012249.xml 2022-11-23T01:46:55.9574529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9574995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9575563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9576036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9576269Z 2022-11-23T01:46:55.9576379Z Running tests... 2022-11-23T01:46:55.9576769Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9577302Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9577926Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9578475Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2012 2022-11-23T01:46:55.9578915Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2013 2022-11-23T01:46:55.9579529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9579985Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9580563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9581023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9581606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9582061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9582622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9583097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9583555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9584059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9584730Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9585429Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9585959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9586442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9586932Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4yj920x7 2022-11-23T01:46:55.9587477Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4yj920x7/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9588017Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprb5xe4km 2022-11-23T01:46:55.9588544Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprb5xe4km/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9589277Z [1669166584.237311] [08317a7e7676:2013 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9589795Z [1669166584.250030] [08317a7e7676:2013 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9590274Z [1669166584.250030] [08317a7e7676:2013 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9590778Z [1669166584.232937] [08317a7e7676:2012 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9591351Z [1669166584.246563] [08317a7e7676:2012 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9591836Z [1669166584.246563] [08317a7e7676:2012 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9592318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9592795Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:55.9593282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9593769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9594195Z ok (6.674s) 2022-11-23T01:46:55.9594328Z 2022-11-23T01:46:55.9594608Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9594935Z Ran 1 test in 6.674s 2022-11-23T01:46:55.9595101Z 2022-11-23T01:46:55.9595195Z OK 2022-11-23T01:46:55.9595330Z 2022-11-23T01:46:55.9595438Z Generating XML reports... 2022-11-23T01:46:55.9596052Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012258.xml 2022-11-23T01:46:55.9596775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9597232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9597796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9598271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9598504Z 2022-11-23T01:46:55.9598615Z Running tests... 2022-11-23T01:46:55.9599003Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9599541Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9600135Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9600704Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2130 2022-11-23T01:46:55.9601137Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2131 2022-11-23T01:46:55.9601746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9602205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9602774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9603252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9603845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9604300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9604857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9605333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9605844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9606348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9607000Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9607707Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9608304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9608794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9609283Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpju80g0ur 2022-11-23T01:46:55.9609828Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpju80g0ur/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9610369Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7rp8m9e0 2022-11-23T01:46:55.9610893Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7rp8m9e0/_remote_module_non_scriptable.py 2022-11-23T01:46:55.9611550Z [1669166593.426334] [08317a7e7676:2131 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9612062Z [1669166593.439616] [08317a7e7676:2131 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9612540Z [1669166593.439616] [08317a7e7676:2131 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9613039Z [1669166593.423630] [08317a7e7676:2130 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9613542Z [1669166593.437207] [08317a7e7676:2130 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9614009Z [1669166593.437207] [08317a7e7676:2130 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9614492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9614973Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:55.9615462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9615949Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:55.9616302Z ok (6.164s) 2022-11-23T01:46:55.9616432Z 2022-11-23T01:46:55.9616708Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9617043Z Ran 1 test in 6.164s 2022-11-23T01:46:55.9617207Z 2022-11-23T01:46:55.9617300Z OK 2022-11-23T01:46:55.9617435Z 2022-11-23T01:46:55.9617540Z Generating XML reports... 2022-11-23T01:46:55.9618150Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012308.xml 2022-11-23T01:46:55.9618875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9619339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9619900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9620382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9620613Z 2022-11-23T01:46:55.9620724Z Running tests... 2022-11-23T01:46:55.9621112Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9621650Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9622214Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9623296Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/76428 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.645s) 2022-11-23T01:46:55.9623834Z 2022-11-23T01:46:55.9624082Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9624464Z Ran 1 test in 1.645s 2022-11-23T01:46:55.9624634Z 2022-11-23T01:46:55.9624743Z OK (skipped=1) 2022-11-23T01:46:55.9624899Z 2022-11-23T01:46:55.9625026Z Generating XML reports... 2022-11-23T01:46:55.9625617Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012316.xml 2022-11-23T01:46:55.9626338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9626797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9627361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9663093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9663351Z 2022-11-23T01:46:55.9663464Z Running tests... 2022-11-23T01:46:55.9663902Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9664454Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9665024Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9665550Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2282 2022-11-23T01:46:55.9666009Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2283 2022-11-23T01:46:55.9666628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9667084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9667661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9668135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9668724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9669505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9670080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9670558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9671022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9671510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9672179Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9672882Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9673424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9673888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9674229Z ok (4.313s) 2022-11-23T01:46:55.9674377Z 2022-11-23T01:46:55.9674649Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9674957Z Ran 1 test in 4.313s 2022-11-23T01:46:55.9675121Z 2022-11-23T01:46:55.9675216Z OK 2022-11-23T01:46:55.9675349Z 2022-11-23T01:46:55.9675472Z Generating XML reports... 2022-11-23T01:46:55.9676079Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012321.xml 2022-11-23T01:46:55.9676791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9677248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9677964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9678443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9678678Z 2022-11-23T01:46:55.9678789Z Running tests... 2022-11-23T01:46:55.9679199Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9679736Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9680292Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9681384Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77294 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.611s) 2022-11-23T01:46:55.9681993Z 2022-11-23T01:46:55.9682271Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9682601Z Ran 1 test in 1.612s 2022-11-23T01:46:55.9682765Z 2022-11-23T01:46:55.9682854Z OK (skipped=1) 2022-11-23T01:46:55.9683011Z 2022-11-23T01:46:55.9683133Z Generating XML reports... 2022-11-23T01:46:55.9683738Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012327.xml 2022-11-23T01:46:55.9684461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9684902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9685486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9685961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9686191Z 2022-11-23T01:46:55.9686285Z Running tests... 2022-11-23T01:46:55.9686693Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9687223Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9687758Z test_DistributedSampler_padding (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9688255Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2419 2022-11-23T01:46:55.9688706Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2420 2022-11-23T01:46:55.9689317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9689767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9690345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9690822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9691408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9691836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9692408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9692877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9693331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9693821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9694490Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9695240Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9695766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9696244Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9696758Z [1669166617.353934] [08317a7e7676:2420 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9697264Z [1669166617.367368] [08317a7e7676:2420 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9697719Z [1669166617.367368] [08317a7e7676:2420 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9698290Z [1669166617.351143] [08317a7e7676:2419 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9698794Z [1669166617.364966] [08317a7e7676:2419 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9699261Z [1669166617.364966] [08317a7e7676:2419 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9699586Z ok (6.138s) 2022-11-23T01:46:55.9699733Z 2022-11-23T01:46:55.9700008Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9700341Z Ran 1 test in 6.138s 2022-11-23T01:46:55.9700505Z 2022-11-23T01:46:55.9700595Z OK 2022-11-23T01:46:55.9700710Z 2022-11-23T01:46:55.9700832Z Generating XML reports... 
2022-11-23T01:46:55.9701441Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012332.xml 2022-11-23T01:46:55.9702162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9702608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9703189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9703667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9703899Z 2022-11-23T01:46:55.9704005Z Running tests... 2022-11-23T01:46:55.9704389Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9704917Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9705428Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) ... skip: no torchvision (0.002s) 2022-11-23T01:46:55.9705722Z 2022-11-23T01:46:55.9705971Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9706296Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9706457Z 2022-11-23T01:46:55.9706565Z OK (skipped=1) 2022-11-23T01:46:55.9706721Z 2022-11-23T01:46:55.9706851Z Generating XML reports... 
2022-11-23T01:46:55.9707438Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012340.xml 2022-11-23T01:46:55.9708153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9708612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9709439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9709922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9710164Z 2022-11-23T01:46:55.9710275Z Running tests... 2022-11-23T01:46:55.9710686Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9711236Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9711791Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9712315Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:46:55.9712623Z 2022-11-23T01:46:55.9712892Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9713203Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9713365Z 2022-11-23T01:46:55.9713474Z OK (skipped=1) 2022-11-23T01:46:55.9713631Z 2022-11-23T01:46:55.9713755Z Generating XML reports... 
2022-11-23T01:46:55.9714342Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012343.xml 2022-11-23T01:46:55.9715149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9715609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9716196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9716657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9716888Z 2022-11-23T01:46:55.9716993Z Running tests... 2022-11-23T01:46:55.9717392Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9717902Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9718382Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9718913Z Runs multiple iterations on _test_accumulate_gradients_no_sync ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:46:55.9719224Z 2022-11-23T01:46:55.9719486Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9719794Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9719951Z 2022-11-23T01:46:55.9720057Z OK (skipped=1) 2022-11-23T01:46:55.9720214Z 2022-11-23T01:46:55.9720339Z Generating XML reports... 
2022-11-23T01:46:55.9720927Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012345.xml 2022-11-23T01:46:55.9721652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9722113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9722690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9723149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9723391Z 2022-11-23T01:46:55.9723499Z Running tests... 2022-11-23T01:46:55.9723900Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9724412Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9724918Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9725485Z Runs multiple iterations on _test_accumulate_gradients_no_sync using allreduce ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:46:55.9725814Z 2022-11-23T01:46:55.9726075Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9726380Z Ran 1 test in 0.003s 2022-11-23T01:46:55.9726539Z 2022-11-23T01:46:55.9726644Z OK (skipped=1) 2022-11-23T01:46:55.9726799Z 2022-11-23T01:46:55.9726921Z Generating XML reports... 
2022-11-23T01:46:55.9727520Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012347.xml 2022-11-23T01:46:55.9728226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9728676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9729309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9729779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9730012Z 2022-11-23T01:46:55.9730122Z Running tests... 2022-11-23T01:46:55.9730532Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9731059Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9731525Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T01:46:55.9732047Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:46:55.9732408Z 2022-11-23T01:46:55.9732676Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9733003Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9733149Z 2022-11-23T01:46:55.9733259Z OK (skipped=1) 2022-11-23T01:46:55.9733412Z 2022-11-23T01:46:55.9733533Z Generating XML reports... 
2022-11-23T01:46:55.9734134Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012350.xml 2022-11-23T01:46:55.9734837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9735291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9735865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9736342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9736557Z 2022-11-23T01:46:55.9736664Z Running tests... 2022-11-23T01:46:55.9737071Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9737604Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9738082Z test_all_gather (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9738562Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2698 2022-11-23T01:46:55.9739014Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2699 2022-11-23T01:46:55.9739624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9740063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9740639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9741119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9741688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9742149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9742726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9743194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9743632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9744138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9744797Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9745494Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9746005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9746642Z STAGE:2022-11-23 01:23:56 2698:2698 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9747134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9747693Z STAGE:2022-11-23 01:23:56 2699:2699 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9748220Z [1669166636.627917] [08317a7e7676:2699 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9748801Z [1669166638.293440] [08317a7e7676:2699 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9749524Z [1669166638.293440] [08317a7e7676:2699 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9750106Z STAGE:2022-11-23 01:23:58 2699:2699 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9750642Z [1669166636.607724] [08317a7e7676:2698 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9751145Z [1669166638.254534] [08317a7e7676:2698 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9751615Z [1669166638.254534] [08317a7e7676:2698 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9752180Z STAGE:2022-11-23 01:23:58 2698:2698 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9752783Z STAGE:2022-11-23 01:23:58 2699:2699 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9753386Z STAGE:2022-11-23 01:23:58 2698:2698 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9753964Z STAGE:2022-11-23 01:23:58 2698:2698 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9754520Z STAGE:2022-11-23 01:23:58 2699:2699 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9755090Z STAGE:2022-11-23 01:23:58 2698:2698 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9755661Z STAGE:2022-11-23 01:23:58 2699:2699 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9756256Z STAGE:2022-11-23 01:23:58 2698:2698 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9756833Z STAGE:2022-11-23 01:23:58 2699:2699 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9757191Z ok (6.653s) 2022-11-23T01:46:55.9757343Z 2022-11-23T01:46:55.9757612Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9757921Z Ran 1 test in 6.653s 2022-11-23T01:46:55.9758085Z 2022-11-23T01:46:55.9758179Z OK 2022-11-23T01:46:55.9758316Z 2022-11-23T01:46:55.9758438Z Generating XML reports... 
2022-11-23T01:46:55.9759050Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012352.xml 2022-11-23T01:46:55.9759763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9760221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9760801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9761260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9761491Z 2022-11-23T01:46:55.9761598Z Running tests... 2022-11-23T01:46:55.9762004Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9762535Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9763156Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:46:55.9763495Z 2022-11-23T01:46:55.9763763Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9764092Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9764255Z 2022-11-23T01:46:55.9764344Z OK (skipped=1) 2022-11-23T01:46:55.9764502Z 2022-11-23T01:46:55.9764626Z Generating XML reports... 
2022-11-23T01:46:55.9765229Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012401.xml 2022-11-23T01:46:55.9765951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9766463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9767046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9767532Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9767770Z 2022-11-23T01:46:55.9767880Z Running tests... 2022-11-23T01:46:55.9768266Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9768581Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9768877Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:46:55.9768897Z 2022-11-23T01:46:55.9769158Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9769270Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9769289Z 2022-11-23T01:46:55.9769399Z OK (skipped=1) 2022-11-23T01:46:55.9769418Z 2022-11-23T01:46:55.9769541Z Generating XML reports... 
2022-11-23T01:46:55.9769997Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012404.xml 2022-11-23T01:46:55.9770378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9770556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9770939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9771130Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9771150Z 2022-11-23T01:46:55.9771256Z Running tests... 2022-11-23T01:46:55.9771521Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9771834Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9772112Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:46:55.9772149Z 2022-11-23T01:46:55.9772393Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9772506Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9772526Z 2022-11-23T01:46:55.9772630Z OK (skipped=1) 2022-11-23T01:46:55.9772649Z 2022-11-23T01:46:55.9772770Z Generating XML reports... 
2022-11-23T01:46:55.9773215Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012406.xml 2022-11-23T01:46:55.9773588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9773764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9774151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9774329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9774367Z 2022-11-23T01:46:55.9774457Z Running tests... 2022-11-23T01:46:55.9774770Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9775090Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9775386Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T01:46:55.9775407Z 2022-11-23T01:46:55.9775667Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9775781Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9775801Z 2022-11-23T01:46:55.9775906Z OK (skipped=1) 2022-11-23T01:46:55.9775925Z 2022-11-23T01:46:55.9776048Z Generating XML reports... 
2022-11-23T01:46:55.9776472Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012409.xml 2022-11-23T01:46:55.9776897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9777075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9777461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9777649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9777669Z 2022-11-23T01:46:55.9777772Z Running tests... 2022-11-23T01:46:55.9778033Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9778344Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9778638Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.003s) 2022-11-23T01:46:55.9778661Z 2022-11-23T01:46:55.9778906Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9779017Z Ran 1 test in 0.003s 2022-11-23T01:46:55.9779037Z 2022-11-23T01:46:55.9779142Z OK (skipped=1) 2022-11-23T01:46:55.9779161Z 2022-11-23T01:46:55.9779287Z Generating XML reports... 
2022-11-23T01:46:55.9779737Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012411.xml 2022-11-23T01:46:55.9780110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9780284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9780667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9780859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9780883Z 2022-11-23T01:46:55.9780973Z Running tests... 2022-11-23T01:46:55.9781235Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9781547Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9781811Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9782030Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 2977 2022-11-23T01:46:55.9782249Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 2978 2022-11-23T01:46:55.9782620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9782795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9783158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9783355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9783721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9783894Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9784320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9784514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9784767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9785016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9785421Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9785806Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9786089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9786429Z STAGE:2022-11-23 01:24:17 2978:2978 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9786659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9786984Z STAGE:2022-11-23 01:24:17 2977:2977 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9787263Z [1669166657.825913] [08317a7e7676:2978 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9787498Z [1669166659.475872] [08317a7e7676:2978 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9787735Z [1669166659.475872] [08317a7e7676:2978 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9788011Z [1669166657.802728] [08317a7e7676:2977 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9788227Z [1669166659.461264] [08317a7e7676:2977 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9788462Z [1669166659.461264] [08317a7e7676:2977 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9789234Z STAGE:2022-11-23 01:24:19 2978:2978 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:24:19 2977:2977 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9789259Z 2022-11-23T01:46:55.9789623Z STAGE:2022-11-23 01:24:19 2978:2978 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9789969Z STAGE:2022-11-23 01:24:19 2977:2977 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9790306Z STAGE:2022-11-23 01:24:19 2978:2978 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9790637Z STAGE:2022-11-23 01:24:19 2977:2977 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9790969Z STAGE:2022-11-23 01:24:19 2978:2978 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9791304Z STAGE:2022-11-23 01:24:19 2977:2977 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9791649Z STAGE:2022-11-23 01:24:19 2978:2978 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9791976Z STAGE:2022-11-23 01:24:19 2977:2977 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9792077Z ok (6.653s) 2022-11-23T01:46:55.9792097Z 2022-11-23T01:46:55.9792361Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9792475Z Ran 1 test in 6.653s 2022-11-23T01:46:55.9792495Z 2022-11-23T01:46:55.9792584Z OK 2022-11-23T01:46:55.9792603Z 2022-11-23T01:46:55.9792724Z Generating XML reports... 
2022-11-23T01:46:55.9793249Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012413.xml 2022-11-23T01:46:55.9793638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9793815Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9794181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9794375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9794395Z 2022-11-23T01:46:55.9794503Z Running tests... 2022-11-23T01:46:55.9794768Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9795146Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9795418Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T01:46:55.9795438Z 2022-11-23T01:46:55.9795701Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9795813Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9795832Z 2022-11-23T01:46:55.9795921Z OK (skipped=1) 2022-11-23T01:46:55.9795958Z 2022-11-23T01:46:55.9796063Z Generating XML reports... 
2022-11-23T01:46:55.9796512Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012423.xml 2022-11-23T01:46:55.9796887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9797062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9797447Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9797641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9797660Z 2022-11-23T01:46:55.9797767Z Running tests... 2022-11-23T01:46:55.9798033Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9798332Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9798611Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T01:46:55.9798631Z 2022-11-23T01:46:55.9798892Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9799001Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9799021Z 2022-11-23T01:46:55.9799126Z OK (skipped=1) 2022-11-23T01:46:55.9799145Z 2022-11-23T01:46:55.9799266Z Generating XML reports... 
2022-11-23T01:46:55.9799717Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012425.xml 2022-11-23T01:46:55.9800091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9800270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9800637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9800829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9800849Z 2022-11-23T01:46:55.9800954Z Running tests... 2022-11-23T01:46:55.9801217Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9801529Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9801792Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9802016Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3157 2022-11-23T01:46:55.9802234Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3158 2022-11-23T01:46:55.9802636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9802819Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9803205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9803398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9803765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9803938Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9804310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9804553Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9804810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9805039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9805449Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9805850Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9806080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9806322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:55.9806557Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9806807Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:55.9807214Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9807545Z STAGE:2022-11-23 01:24:31 3157:3157 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9807926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9808252Z STAGE:2022-11-23 01:24:31 3158:3158 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9808528Z [1669166671.807732] [08317a7e7676:3158 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9808762Z [1669166673.452758] [08317a7e7676:3158 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9809000Z [1669166673.452758] [08317a7e7676:3158 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9809276Z [1669166671.786581] [08317a7e7676:3157 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9809503Z [1669166673.460535] [08317a7e7676:3157 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9809742Z [1669166673.460535] [08317a7e7676:3157 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9810295Z STAGE:2022-11-23 01:24:33 3158:3158 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:24:33 3157:3157 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9810320Z 2022-11-23T01:46:55.9810670Z STAGE:2022-11-23 01:24:33 3158:3158 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9811017Z STAGE:2022-11-23 01:24:33 3157:3157 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9811421Z STAGE:2022-11-23 01:24:33 3158:3158 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9811759Z STAGE:2022-11-23 01:24:33 3157:3157 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9812093Z STAGE:2022-11-23 01:24:33 3158:3158 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9812430Z STAGE:2022-11-23 01:24:33 3157:3157 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9812774Z STAGE:2022-11-23 01:24:33 3158:3158 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9813113Z STAGE:2022-11-23 01:24:33 3157:3157 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9813265Z ok (6.654s) 2022-11-23T01:46:55.9813285Z 2022-11-23T01:46:55.9813554Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9813648Z Ran 1 test in 6.654s 2022-11-23T01:46:55.9813687Z 2022-11-23T01:46:55.9813764Z OK 2022-11-23T01:46:55.9813783Z 2022-11-23T01:46:55.9813910Z Generating XML reports... 
2022-11-23T01:46:55.9814359Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012427.xml 2022-11-23T01:46:55.9814731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9814907Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9815294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9815486Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9815510Z 2022-11-23T01:46:55.9815617Z Running tests... 2022-11-23T01:46:55.9815867Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9816184Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9816443Z test_all_gather_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9816662Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3271 2022-11-23T01:46:55.9816879Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3272 2022-11-23T01:46:55.9817255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9817429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9817811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9817990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9818362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9818537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9818915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9819103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9819352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9819598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9820001Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9820406Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9820622Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9820899Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9821068Z skip: Skipped due to small world size. (4.203s) 2022-11-23T01:46:55.9821088Z 2022-11-23T01:46:55.9821359Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9821472Z Ran 1 test in 4.203s 2022-11-23T01:46:55.9821492Z 2022-11-23T01:46:55.9821600Z OK (skipped=1) 2022-11-23T01:46:55.9821619Z 2022-11-23T01:46:55.9821745Z Generating XML reports... 2022-11-23T01:46:55.9822198Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012436.xml 2022-11-23T01:46:55.9822627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9822786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9823172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9823366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9823385Z 2022-11-23T01:46:55.9823494Z Running tests... 2022-11-23T01:46:55.9823758Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9824072Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9824373Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T01:46:55.9824394Z 2022-11-23T01:46:55.9824656Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9824770Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9824790Z 2022-11-23T01:46:55.9824879Z OK (skipped=1) 2022-11-23T01:46:55.9824898Z 2022-11-23T01:46:55.9825019Z Generating XML reports... 2022-11-23T01:46:55.9825470Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012443.xml 2022-11-23T01:46:55.9825846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9826021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9826402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9826595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9826615Z 2022-11-23T01:46:55.9826721Z Running tests... 2022-11-23T01:46:55.9826965Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9827285Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9827589Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T01:46:55.9827612Z 2022-11-23T01:46:55.9827868Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9827978Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9827997Z 2022-11-23T01:46:55.9828103Z OK (skipped=1) 2022-11-23T01:46:55.9828122Z 2022-11-23T01:46:55.9828244Z Generating XML reports... 
2022-11-23T01:46:55.9828692Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012446.xml 2022-11-23T01:46:55.9829283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9829450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9829847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9830042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9830062Z 2022-11-23T01:46:55.9830255Z Running tests... 2022-11-23T01:46:55.9830530Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9830846Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9831138Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T01:46:55.9831158Z 2022-11-23T01:46:55.9831418Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9831527Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9831546Z 2022-11-23T01:46:55.9831634Z OK (skipped=1) 2022-11-23T01:46:55.9831653Z 2022-11-23T01:46:55.9831838Z Generating XML reports... 
2022-11-23T01:46:55.9832291Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012448.xml 2022-11-23T01:46:55.9832668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9832847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9833231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9833423Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9833443Z 2022-11-23T01:46:55.9833549Z Running tests... 2022-11-23T01:46:55.9833811Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9834108Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9834410Z test_all_gather_multigpu_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T01:46:55.9834434Z 2022-11-23T01:46:55.9834693Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9834804Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9834827Z 2022-11-23T01:46:55.9834935Z OK (skipped=1) 2022-11-23T01:46:55.9834955Z 2022-11-23T01:46:55.9835078Z Generating XML reports... 
2022-11-23T01:46:55.9835523Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012450.xml 2022-11-23T01:46:55.9835895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9836053Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9836434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9836628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9836647Z 2022-11-23T01:46:55.9836751Z Running tests... 2022-11-23T01:46:55.9837011Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9837324Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9837602Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9837820Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3506 2022-11-23T01:46:55.9838039Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3507 2022-11-23T01:46:55.9838398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9838574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9838956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9839151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9839562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9839742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9840127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9840321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9840554Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9840804Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9841209Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9841664Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9841903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9842137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9842412Z [1669166697.077523] [08317a7e7676:3506 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9842646Z [1669166698.497536] [08317a7e7676:3506 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9842884Z [1669166698.497536] [08317a7e7676:3506 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9843156Z [1669166697.100053] [08317a7e7676:3507 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9843370Z [1669166698.494628] [08317a7e7676:3507 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9843609Z [1669166698.494628] [08317a7e7676:3507 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9843710Z ok (7.233s) 2022-11-23T01:46:55.9843730Z 2022-11-23T01:46:55.9843995Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9844105Z Ran 1 test in 7.233s 2022-11-23T01:46:55.9844124Z 2022-11-23T01:46:55.9844215Z OK 2022-11-23T01:46:55.9844234Z 2022-11-23T01:46:55.9844355Z Generating XML reports... 
2022-11-23T01:46:55.9844804Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012453.xml 2022-11-23T01:46:55.9845187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9845346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9845736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9845928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9845947Z 2022-11-23T01:46:55.9846053Z Running tests... 2022-11-23T01:46:55.9846314Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9846626Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9846902Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9847122Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 3617 2022-11-23T01:46:55.9847325Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 3618 2022-11-23T01:46:55.9847699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9847874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9848306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9848508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9848877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9849051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9849431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9849625Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9849906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9850157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9850568Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9850968Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9851202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9851436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9851678Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:55.9851925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:55.9852330Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9852712Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9852955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:46:55.9853194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:46:55.9853585Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:55.9853983Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:55.9854225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T01:46:55.9854467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T01:46:55.9854863Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T01:46:55.9855259Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T01:46:55.9855534Z [1669166706.903947] [08317a7e7676:3618 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9855749Z [1669166708.301387] [08317a7e7676:3618 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9855989Z [1669166708.301387] [08317a7e7676:3618 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9856264Z [1669166706.880208] [08317a7e7676:3617 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9856492Z [1669166708.276892] [08317a7e7676:3617 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9856825Z [1669166708.276892] [08317a7e7676:3617 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9856937Z ok (7.639s) 2022-11-23T01:46:55.9856957Z 2022-11-23T01:46:55.9857230Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9857346Z Ran 1 test in 7.639s 2022-11-23T01:46:55.9857365Z 2022-11-23T01:46:55.9857459Z OK 2022-11-23T01:46:55.9857478Z 2022-11-23T01:46:55.9857583Z Generating XML reports... 
2022-11-23T01:46:55.9858037Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012502.xml 2022-11-23T01:46:55.9858466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9858647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9859036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9859231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9859251Z 2022-11-23T01:46:55.9859358Z Running tests... 2022-11-23T01:46:55.9859623Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9859921Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9860184Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports all_gather_v (0.003s) 2022-11-23T01:46:55.9860204Z 2022-11-23T01:46:55.9860464Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9860579Z Ran 1 test in 0.003s 2022-11-23T01:46:55.9860600Z 2022-11-23T01:46:55.9860705Z OK (skipped=1) 2022-11-23T01:46:55.9860725Z 2022-11-23T01:46:55.9860847Z Generating XML reports... 
2022-11-23T01:46:55.9861297Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012513.xml 2022-11-23T01:46:55.9861674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9861851Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9862218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9862411Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9862431Z 2022-11-23T01:46:55.9862539Z Running tests... 2022-11-23T01:46:55.9862801Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9863118Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9863538Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9863561Z 2022-11-23T01:46:55.9863819Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9863929Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9863948Z 2022-11-23T01:46:55.9864054Z OK (skipped=1) 2022-11-23T01:46:55.9864073Z 2022-11-23T01:46:55.9864177Z Generating XML reports... 
2022-11-23T01:46:55.9864624Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012515.xml 2022-11-23T01:46:55.9864998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9865174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9865560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9865752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9865771Z 2022-11-23T01:46:55.9865928Z Running tests... 2022-11-23T01:46:55.9866198Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9866515Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9866920Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9866940Z 2022-11-23T01:46:55.9867197Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9867307Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9867327Z 2022-11-23T01:46:55.9867433Z OK (skipped=1) 2022-11-23T01:46:55.9867452Z 2022-11-23T01:46:55.9867633Z Generating XML reports... 
2022-11-23T01:46:55.9868085Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012517.xml 2022-11-23T01:46:55.9868466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9868643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9869352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9869536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9869557Z 2022-11-23T01:46:55.9869668Z Running tests... 2022-11-23T01:46:55.9869934Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9870251Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9870682Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9870707Z 2022-11-23T01:46:55.9870967Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9871078Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9871101Z 2022-11-23T01:46:55.9871206Z OK (skipped=1) 2022-11-23T01:46:55.9871225Z 2022-11-23T01:46:55.9871345Z Generating XML reports... 
2022-11-23T01:46:55.9871776Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012520.xml 2022-11-23T01:46:55.9872150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9872324Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9872707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9872904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9872923Z 2022-11-23T01:46:55.9873029Z Running tests... 2022-11-23T01:46:55.9873290Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9873604Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9874010Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9874047Z 2022-11-23T01:46:55.9874288Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9874397Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9874417Z 2022-11-23T01:46:55.9874522Z OK (skipped=1) 2022-11-23T01:46:55.9874541Z 2022-11-23T01:46:55.9874663Z Generating XML reports... 
2022-11-23T01:46:55.9875109Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012522.xml 2022-11-23T01:46:55.9875484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9875660Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9876119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9876305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9876343Z 2022-11-23T01:46:55.9876433Z Running tests... 2022-11-23T01:46:55.9876696Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9877011Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9877427Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9877447Z 2022-11-23T01:46:55.9877775Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9877889Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9877908Z 2022-11-23T01:46:55.9878017Z OK (skipped=1) 2022-11-23T01:46:55.9878036Z 2022-11-23T01:46:55.9878161Z Generating XML reports... 
2022-11-23T01:46:55.9878594Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012524.xml 2022-11-23T01:46:55.9878971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9879146Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9879525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9879717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9879737Z 2022-11-23T01:46:55.9879843Z Running tests... 2022-11-23T01:46:55.9880111Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9880426Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9880840Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9880861Z 2022-11-23T01:46:55.9881105Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9881214Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9881234Z 2022-11-23T01:46:55.9881339Z OK (skipped=1) 2022-11-23T01:46:55.9881357Z 2022-11-23T01:46:55.9881479Z Generating XML reports... 
2022-11-23T01:46:55.9881924Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012527.xml 2022-11-23T01:46:55.9882299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9882479Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9882861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9883055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9883075Z 2022-11-23T01:46:55.9883165Z Running tests... 2022-11-23T01:46:55.9883426Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9883737Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9884160Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9884180Z 2022-11-23T01:46:55.9884436Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9884546Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9884566Z 2022-11-23T01:46:55.9884675Z OK (skipped=1) 2022-11-23T01:46:55.9884693Z 2022-11-23T01:46:55.9884814Z Generating XML reports... 
2022-11-23T01:46:55.9885243Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012529.xml 2022-11-23T01:46:55.9885669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9885853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9886242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9886440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9886460Z 2022-11-23T01:46:55.9886570Z Running tests... 2022-11-23T01:46:55.9886834Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9887148Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9887611Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9887632Z 2022-11-23T01:46:55.9887872Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9887988Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9888008Z 2022-11-23T01:46:55.9888116Z OK (skipped=1) 2022-11-23T01:46:55.9888135Z 2022-11-23T01:46:55.9888259Z Generating XML reports... 
2022-11-23T01:46:55.9888702Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012532.xml 2022-11-23T01:46:55.9889076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9889252Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9889632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9889827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9889847Z 2022-11-23T01:46:55.9889937Z Running tests... 2022-11-23T01:46:55.9890201Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9890515Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9890912Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9890932Z 2022-11-23T01:46:55.9891189Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9891299Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9891318Z 2022-11-23T01:46:55.9891424Z OK (skipped=1) 2022-11-23T01:46:55.9891442Z 2022-11-23T01:46:55.9891564Z Generating XML reports... 
2022-11-23T01:46:55.9892006Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012534.xml 2022-11-23T01:46:55.9892371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9892546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9892928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9893120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9893140Z 2022-11-23T01:46:55.9893245Z Running tests... 2022-11-23T01:46:55.9893508Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9893820Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9894120Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9894327Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4079 2022-11-23T01:46:55.9894547Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4080 2022-11-23T01:46:55.9894984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9895170Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9895558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9895754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9896124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9896301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9896679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9896900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9897155Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9897407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9897812Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9898214Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9898451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9899201Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T01:46:55.9899318Z warnings.warn( 2022-11-23T01:46:55.9899549Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9900292Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T01:46:55.9900385Z warnings.warn( 2022-11-23T01:46:55.9900484Z ok (4.358s) 2022-11-23T01:46:55.9900504Z 2022-11-23T01:46:55.9900767Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9900878Z Ran 1 test in 4.358s 2022-11-23T01:46:55.9900897Z 2022-11-23T01:46:55.9900986Z OK 2022-11-23T01:46:55.9901005Z 2022-11-23T01:46:55.9901127Z Generating XML reports... 
2022-11-23T01:46:55.9901581Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012536.xml 2022-11-23T01:46:55.9901956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9902118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9902502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9902695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9902714Z 2022-11-23T01:46:55.9902820Z Running tests... 2022-11-23T01:46:55.9903086Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9903399Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9903798Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9903822Z 2022-11-23T01:46:55.9904084Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9904193Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9904213Z 2022-11-23T01:46:55.9904349Z OK (skipped=1) 2022-11-23T01:46:55.9904370Z 2022-11-23T01:46:55.9904496Z Generating XML reports... 
2022-11-23T01:46:55.9904945Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012543.xml 2022-11-23T01:46:55.9905321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9905499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9905881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9906078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9906140Z 2022-11-23T01:46:55.9906254Z Running tests... 2022-11-23T01:46:55.9906503Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9906821Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9907232Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9907251Z 2022-11-23T01:46:55.9907512Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9907621Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9907641Z 2022-11-23T01:46:55.9907745Z OK (skipped=1) 2022-11-23T01:46:55.9907765Z 2022-11-23T01:46:55.9907886Z Generating XML reports... 
2022-11-23T01:46:55.9908335Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012546.xml 2022-11-23T01:46:55.9908713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9908872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9909489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9909686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9909707Z 2022-11-23T01:46:55.9909817Z Running tests... 2022-11-23T01:46:55.9910082Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9910396Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9910797Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:55.9910817Z 2022-11-23T01:46:55.9911076Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9911224Z Ran 1 test in 0.002s 2022-11-23T01:46:55.9911244Z 2022-11-23T01:46:55.9911335Z OK (skipped=1) 2022-11-23T01:46:55.9911354Z 2022-11-23T01:46:55.9911479Z Generating XML reports... 
2022-11-23T01:46:55.9911925Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012548.xml 2022-11-23T01:46:55.9912296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9912472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9912851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9913042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9913061Z 2022-11-23T01:46:55.9913168Z Running tests... 2022-11-23T01:46:55.9913429Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9913728Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9914018Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9914310Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4281 2022-11-23T01:46:55.9914545Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4282 2022-11-23T01:46:55.9914918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9915097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9915482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9915679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9916097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9916285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9916668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9916860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9917108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9917358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9917761Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9918161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9918396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9918612Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9918713Z ok (4.323s) 2022-11-23T01:46:55.9918732Z 2022-11-23T01:46:55.9919001Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9919112Z Ran 1 test in 4.323s 2022-11-23T01:46:55.9919131Z 2022-11-23T01:46:55.9919224Z OK 2022-11-23T01:46:55.9919243Z 2022-11-23T01:46:55.9919367Z Generating XML reports... 2022-11-23T01:46:55.9919815Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012550.xml 2022-11-23T01:46:55.9920188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9920364Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9920732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9920924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9920943Z 2022-11-23T01:46:55.9921049Z Running tests... 2022-11-23T01:46:55.9921315Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9921632Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9921911Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9922131Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4384 2022-11-23T01:46:55.9922353Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4385 2022-11-23T01:46:55.9922706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9922886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9923265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9923503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9923880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9924057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9924435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9924628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9924858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9925105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9925558Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9925959Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9926195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9926441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:55.9926670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9926910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:55.9927312Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9927708Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9928024Z STAGE:2022-11-23 01:26:01 4384:4384 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9928354Z STAGE:2022-11-23 01:26:01 4385:4385 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9928628Z [1669166761.602023] [08317a7e7676:4384 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9928855Z [1669166763.245310] [08317a7e7676:4384 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9929092Z [1669166763.245310] [08317a7e7676:4384 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9929359Z [1669166761.622699] [08317a7e7676:4385 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9929587Z [1669166763.272797] [08317a7e7676:4385 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9929823Z [1669166763.272797] [08317a7e7676:4385 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9930379Z STAGE:2022-11-23 01:26:03 4384:4384 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:26:03 4385:4385 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9930401Z 2022-11-23T01:46:55.9930748Z STAGE:2022-11-23 01:26:03 4385:4385 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9931077Z STAGE:2022-11-23 01:26:03 4384:4384 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9931398Z STAGE:2022-11-23 01:26:03 4384:4384 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9931721Z STAGE:2022-11-23 01:26:03 4385:4385 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9932047Z STAGE:2022-11-23 01:26:03 4384:4384 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9932412Z STAGE:2022-11-23 01:26:03 4385:4385 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9932760Z STAGE:2022-11-23 01:26:03 4384:4384 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9933093Z STAGE:2022-11-23 01:26:03 4385:4385 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9933185Z ok (6.615s) 2022-11-23T01:46:55.9933204Z 2022-11-23T01:46:55.9933467Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9933561Z Ran 1 test in 6.615s 2022-11-23T01:46:55.9933580Z 2022-11-23T01:46:55.9933669Z OK 2022-11-23T01:46:55.9933688Z 2022-11-23T01:46:55.9933855Z Generating XML reports... 
2022-11-23T01:46:55.9934307Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012557.xml 2022-11-23T01:46:55.9934685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9934862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9935245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9935433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9935453Z 2022-11-23T01:46:55.9935556Z Running tests... 2022-11-23T01:46:55.9935802Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9936105Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9936370Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9936588Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4498 2022-11-23T01:46:55.9936803Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4499 2022-11-23T01:46:55.9937177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9937346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9937722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9937897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9938253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9938422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9938799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9938984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9939224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9939462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9939853Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9940241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9940458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9940698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:55.9940918Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9941145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:55.9941586Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9941978Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9942301Z STAGE:2022-11-23 01:26:10 4499:4499 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9942621Z STAGE:2022-11-23 01:26:10 4498:4498 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9942885Z [1669166770.724258] [08317a7e7676:4499 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9943147Z [1669166772.334379] [08317a7e7676:4499 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9943379Z [1669166772.334379] [08317a7e7676:4499 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9943641Z [1669166770.703550] [08317a7e7676:4498 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9943860Z [1669166772.339349] [08317a7e7676:4498 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9944086Z [1669166772.339349] [08317a7e7676:4498 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9944629Z STAGE:2022-11-23 01:26:12 4499:4499 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:26:12 4498:4498 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9944653Z 2022-11-23T01:46:55.9944993Z STAGE:2022-11-23 01:26:12 4499:4499 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9945332Z STAGE:2022-11-23 01:26:12 4498:4498 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9945646Z STAGE:2022-11-23 01:26:12 4499:4499 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9945962Z STAGE:2022-11-23 01:26:12 4498:4498 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9946274Z STAGE:2022-11-23 01:26:12 4499:4499 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9946596Z STAGE:2022-11-23 01:26:12 4498:4498 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9946938Z STAGE:2022-11-23 01:26:12 4499:4499 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9947276Z STAGE:2022-11-23 01:26:12 4498:4498 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9947375Z ok (6.624s) 2022-11-23T01:46:55.9947394Z 2022-11-23T01:46:55.9947658Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9947768Z Ran 1 test in 6.624s 2022-11-23T01:46:55.9947791Z 2022-11-23T01:46:55.9947876Z OK 2022-11-23T01:46:55.9947895Z 2022-11-23T01:46:55.9948007Z Generating XML reports... 
2022-11-23T01:46:55.9948441Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012606.xml 2022-11-23T01:46:55.9948809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9949199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9949601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9949794Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9949815Z 2022-11-23T01:46:55.9949919Z Running tests... 2022-11-23T01:46:55.9950182Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9950578Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9950854Z test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9951067Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4612 2022-11-23T01:46:55.9951276Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4613 2022-11-23T01:46:55.9951639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9951806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9952181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9952429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9952803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9952978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9953346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9953536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9953787Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9954034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9954435Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9954837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9955072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9955317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:55.9955534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9955756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:55.9956149Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9956472Z STAGE:2022-11-23 01:26:19 4612:4612 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9956857Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9957180Z STAGE:2022-11-23 01:26:19 4613:4613 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9957448Z [1669166779.988661] [08317a7e7676:4613 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9957669Z [1669166781.606986] [08317a7e7676:4613 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9957896Z [1669166781.606986] [08317a7e7676:4613 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9958159Z [1669166779.967376] [08317a7e7676:4612 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9958379Z [1669166781.605028] [08317a7e7676:4612 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9958601Z [1669166781.605028] [08317a7e7676:4612 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9959200Z STAGE:2022-11-23 01:26:21 4613:4613 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:26:21 4612:4612 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9959223Z 2022-11-23T01:46:55.9959569Z STAGE:2022-11-23 01:26:21 4613:4613 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9959905Z STAGE:2022-11-23 01:26:21 4612:4612 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9960219Z STAGE:2022-11-23 01:26:22 4612:4612 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9960533Z STAGE:2022-11-23 01:26:22 4613:4613 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9960911Z STAGE:2022-11-23 01:26:22 4612:4612 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9961227Z STAGE:2022-11-23 01:26:22 4613:4613 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9961565Z STAGE:2022-11-23 01:26:22 4612:4612 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9961893Z STAGE:2022-11-23 01:26:22 4613:4613 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9961979Z ok (6.555s) 2022-11-23T01:46:55.9961998Z 2022-11-23T01:46:55.9962252Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9962353Z Ran 1 test in 6.556s 2022-11-23T01:46:55.9962373Z 2022-11-23T01:46:55.9962454Z OK 2022-11-23T01:46:55.9962473Z 2022-11-23T01:46:55.9962586Z Generating XML reports... 
2022-11-23T01:46:55.9963028Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012616.xml 2022-11-23T01:46:55.9963400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9963566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9963936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9964118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9964138Z 2022-11-23T01:46:55.9964236Z Running tests... 2022-11-23T01:46:55.9964490Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9964794Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9965055Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9965263Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4726 2022-11-23T01:46:55.9965476Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4727 2022-11-23T01:46:55.9965831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9966002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9966369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9966550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9966903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9967065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9967438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9967621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9967860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9968138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9968539Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9968928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9969151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9969386Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:55.9969603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9969879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:55.9970270Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9970659Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:55.9970974Z STAGE:2022-11-23 01:26:29 4727:4727 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9971299Z STAGE:2022-11-23 01:26:29 4726:4726 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9971563Z [1669166789.147829] [08317a7e7676:4727 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:55.9971786Z [1669166790.789865] [08317a7e7676:4727 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9972020Z [1669166790.789865] [08317a7e7676:4727 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9972284Z [1669166789.126854] [08317a7e7676:4726 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:55.9972505Z [1669166790.860494] [08317a7e7676:4726 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:55.9972732Z [1669166790.860494] [08317a7e7676:4726 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:55.9973276Z STAGE:2022-11-23 01:26:31 4727:4727 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:26:31 4726:4726 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9973297Z 2022-11-23T01:46:55.9973638Z STAGE:2022-11-23 01:26:31 4726:4726 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9973980Z STAGE:2022-11-23 01:26:31 4727:4727 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9974291Z STAGE:2022-11-23 01:26:31 4727:4727 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9974606Z STAGE:2022-11-23 01:26:31 4726:4726 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:55.9974929Z STAGE:2022-11-23 01:26:31 4727:4727 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9975243Z STAGE:2022-11-23 01:26:31 4726:4726 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:55.9975575Z STAGE:2022-11-23 01:26:31 4727:4727 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9975901Z STAGE:2022-11-23 01:26:31 4726:4726 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:55.9975993Z ok (6.760s) 2022-11-23T01:46:55.9976016Z 2022-11-23T01:46:55.9976273Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9976368Z Ran 1 test in 6.760s 2022-11-23T01:46:55.9976394Z 2022-11-23T01:46:55.9976468Z OK 2022-11-23T01:46:55.9976487Z 2022-11-23T01:46:55.9976648Z Generating XML reports... 
2022-11-23T01:46:55.9977098Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012625.xml 2022-11-23T01:46:55.9977464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9977634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9978015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9978206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9978226Z 2022-11-23T01:46:55.9978383Z Running tests... 2022-11-23T01:46:55.9978631Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9978948Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9979217Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9979438Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4840 2022-11-23T01:46:55.9979645Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4841 2022-11-23T01:46:55.9980009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9980173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9980543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9980722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9981081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9981244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9981623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9981811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9982048Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9982290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9982686Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9983083Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9983303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9983537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9983685Z skip: Skipped due to small world size. (4.226s) 2022-11-23T01:46:55.9983704Z 2022-11-23T01:46:55.9983969Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9984079Z Ran 1 test in 4.227s 2022-11-23T01:46:55.9984099Z 2022-11-23T01:46:55.9984194Z OK (skipped=1) 2022-11-23T01:46:55.9984213Z 2022-11-23T01:46:55.9984333Z Generating XML reports... 2022-11-23T01:46:55.9984778Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012634.xml 2022-11-23T01:46:55.9985144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9985308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9985692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9985929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9985950Z 2022-11-23T01:46:55.9986053Z Running tests... 2022-11-23T01:46:55.9986318Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9986629Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9986887Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9987105Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 4943 2022-11-23T01:46:55.9987306Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 4944 2022-11-23T01:46:55.9987728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9987898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9988280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9988465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9988827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9989210Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9989606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9989788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9990019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9990270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9990669Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9991066Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9991297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9991518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9991672Z skip: Skipped due to small world size. (4.256s) 2022-11-23T01:46:55.9991692Z 2022-11-23T01:46:55.9991958Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9992060Z Ran 1 test in 4.256s 2022-11-23T01:46:55.9992083Z 2022-11-23T01:46:55.9992171Z OK (skipped=1) 2022-11-23T01:46:55.9992190Z 2022-11-23T01:46:55.9992311Z Generating XML reports... 2022-11-23T01:46:55.9992759Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012641.xml 2022-11-23T01:46:55.9993129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9993307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9993685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9993870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9993890Z 2022-11-23T01:46:55.9993996Z Running tests... 2022-11-23T01:46:55.9994243Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9994558Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:55.9994826Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:55.9995041Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5046 2022-11-23T01:46:55.9995325Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5047 2022-11-23T01:46:55.9995703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9995875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9996244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9996434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9996783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:55.9997067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:55.9997437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:55.9997629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:55.9997878Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:55.9998116Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:55.9998514Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:55.9998903Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:55.9999133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:55.9999350Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:55.9999506Z skip: Skipped due to small world size. (4.206s) 2022-11-23T01:46:55.9999526Z 2022-11-23T01:46:55.9999785Z ---------------------------------------------------------------------- 2022-11-23T01:46:55.9999895Z Ran 1 test in 4.206s 2022-11-23T01:46:55.9999914Z 2022-11-23T01:46:56.0000022Z OK (skipped=1) 2022-11-23T01:46:56.0000042Z 2022-11-23T01:46:56.0000159Z Generating XML reports... 2022-11-23T01:46:56.0000613Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012648.xml 2022-11-23T01:46:56.0000986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0001145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0001516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0001709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0001729Z 2022-11-23T01:46:56.0001835Z Running tests... 2022-11-23T01:46:56.0002089Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0002403Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0002667Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0002874Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5149 2022-11-23T01:46:56.0003090Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5150 2022-11-23T01:46:56.0003449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0003622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0003998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0004186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0004601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0004770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0005152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0005340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0005569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0005810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0006262Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0006658Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0006886Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0007116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0007276Z skip: Skipped due to small world size. (4.359s) 2022-11-23T01:46:56.0007295Z 2022-11-23T01:46:56.0007550Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0007657Z Ran 1 test in 4.359s 2022-11-23T01:46:56.0007677Z 2022-11-23T01:46:56.0007765Z OK (skipped=1) 2022-11-23T01:46:56.0007784Z 2022-11-23T01:46:56.0007906Z Generating XML reports... 2022-11-23T01:46:56.0008345Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012654.xml 2022-11-23T01:46:56.0008722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0008901Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0009271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0009461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0009481Z 2022-11-23T01:46:56.0009578Z Running tests... 2022-11-23T01:46:56.0009836Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0010135Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0010387Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0010602Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5252 2022-11-23T01:46:56.0010815Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5253 2022-11-23T01:46:56.0011232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0011404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0011787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0011978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0012331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0012496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0012872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0013064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0013303Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0013594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0014005Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0014398Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0014627Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0014943Z STAGE:2022-11-23 01:27:05 5253:5253 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0015162Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0015532Z STAGE:2022-11-23 01:27:05 5252:5252 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0015813Z [1669166825.676905] [08317a7e7676:5253 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0016045Z [1669166827.310350] [08317a7e7676:5253 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0016275Z [1669166827.310350] [08317a7e7676:5253 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0016546Z [1669166825.655302] [08317a7e7676:5252 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0016774Z [1669166827.355617] [08317a7e7676:5252 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0017005Z [1669166827.355617] [08317a7e7676:5252 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0017558Z STAGE:2022-11-23 01:27:07 5253:5253 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:27:07 5252:5252 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0017580Z 2022-11-23T01:46:56.0017931Z STAGE:2022-11-23 01:27:07 5253:5253 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0018259Z STAGE:2022-11-23 01:27:07 5252:5252 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0018577Z STAGE:2022-11-23 01:27:07 5253:5253 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0018898Z STAGE:2022-11-23 01:27:07 5252:5252 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0019226Z STAGE:2022-11-23 01:27:07 5253:5253 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0019775Z STAGE:2022-11-23 01:27:07 5252:5252 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:27:07 5253:5253 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0019799Z 2022-11-23T01:46:56.0020140Z STAGE:2022-11-23 01:27:07 5252:5252 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0020232Z ok (6.778s) 2022-11-23T01:46:56.0020251Z 2022-11-23T01:46:56.0020515Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0020624Z Ran 1 test in 6.778s 2022-11-23T01:46:56.0020644Z 2022-11-23T01:46:56.0020718Z OK 2022-11-23T01:46:56.0020737Z 2022-11-23T01:46:56.0020850Z Generating XML reports... 
2022-11-23T01:46:56.0021296Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012701.xml 2022-11-23T01:46:56.0021667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0021841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0022282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0022474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0022494Z 2022-11-23T01:46:56.0022600Z Running tests... 2022-11-23T01:46:56.0022850Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0023164Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0023412Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0023630Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5366 2022-11-23T01:46:56.0023847Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5367 2022-11-23T01:46:56.0024269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0024444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0024830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0025013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0025366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0025538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0025913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0026092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0026341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0026585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0026985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0027387Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0027615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0027932Z STAGE:2022-11-23 01:27:14 5366:5366 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0028154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0028483Z STAGE:2022-11-23 01:27:14 5367:5367 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0028758Z [1669166834.817296] [08317a7e7676:5366 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0029199Z [1669166836.452662] [08317a7e7676:5366 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0029440Z [1669166836.452662] [08317a7e7676:5366 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0029713Z [1669166834.839004] [08317a7e7676:5367 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0029942Z [1669166836.503904] [08317a7e7676:5367 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0030167Z [1669166836.503904] [08317a7e7676:5367 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0030724Z STAGE:2022-11-23 01:27:16 5366:5366 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:27:16 5367:5367 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0030746Z 2022-11-23T01:46:56.0031153Z STAGE:2022-11-23 01:27:16 5367:5367 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0031516Z STAGE:2022-11-23 01:27:16 5366:5366 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0031844Z STAGE:2022-11-23 01:27:17 5367:5367 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0032161Z STAGE:2022-11-23 01:27:17 5366:5366 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0032492Z STAGE:2022-11-23 01:27:17 5367:5367 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0032806Z STAGE:2022-11-23 01:27:17 5366:5366 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0033228Z STAGE:2022-11-23 01:27:17 5367:5367 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0033571Z STAGE:2022-11-23 01:27:17 5366:5366 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0033667Z ok (6.714s) 2022-11-23T01:46:56.0033687Z 2022-11-23T01:46:56.0033935Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0034045Z Ran 1 test in 6.715s 2022-11-23T01:46:56.0034065Z 2022-11-23T01:46:56.0034157Z OK 2022-11-23T01:46:56.0034176Z 2022-11-23T01:46:56.0034298Z Generating XML reports... 
2022-11-23T01:46:56.0034739Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012710.xml 2022-11-23T01:46:56.0035112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0035284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0035667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0035856Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0035879Z 2022-11-23T01:46:56.0035970Z Running tests... 2022-11-23T01:46:56.0036231Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0036535Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0036798Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0037016Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5480 2022-11-23T01:46:56.0037224Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5481 2022-11-23T01:46:56.0037598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0037767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0038131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0038326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0038695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0038858Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0039224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0039405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0039642Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0039880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0040273Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0040703Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0040928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0041148Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0041474Z STAGE:2022-11-23 01:27:25 5480:5480 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0042251Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:46:56.0042416Z warnings.warn( 2022-11-23T01:46:56.0042745Z STAGE:2022-11-23 01:27:25 5481:5481 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0043522Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:46:56.0043632Z warnings.warn( 2022-11-23T01:46:56.0043900Z [1669166845.827908] [08317a7e7676:5480 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0044116Z [1669166845.842621] [08317a7e7676:5480 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0044353Z [1669166845.842621] [08317a7e7676:5480 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0044627Z [1669166845.833029] [08317a7e7676:5481 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0044847Z [1669166845.847688] [08317a7e7676:5481 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0045080Z [1669166845.847688] [08317a7e7676:5481 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0045636Z STAGE:2022-11-23 01:27:26 5480:5480 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:27:26 5481:5481 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0045658Z 2022-11-23T01:46:56.0046215Z STAGE:2022-11-23 01:27:26 5481:5481 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:27:26 5480:5480 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0046238Z 2022-11-23T01:46:56.0046562Z STAGE:2022-11-23 01:27:26 5480:5480 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0046898Z STAGE:2022-11-23 01:27:26 5480:5480 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0047235Z STAGE:2022-11-23 01:27:26 5480:5480 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0047552Z STAGE:2022-11-23 01:27:26 5481:5481 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0047863Z STAGE:2022-11-23 01:27:26 5481:5481 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0048205Z STAGE:2022-11-23 01:27:26 5481:5481 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0048300Z ok (6.861s) 2022-11-23T01:46:56.0048319Z 2022-11-23T01:46:56.0048586Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0048758Z Ran 1 test in 6.861s 2022-11-23T01:46:56.0048779Z 2022-11-23T01:46:56.0048864Z OK 2022-11-23T01:46:56.0048883Z 2022-11-23T01:46:56.0049007Z Generating XML reports... 
2022-11-23T01:46:56.0049504Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012720.xml 2022-11-23T01:46:56.0049872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0050049Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0050436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0050619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0050639Z 2022-11-23T01:46:56.0050743Z Running tests... 2022-11-23T01:46:56.0051065Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0051368Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0051646Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0051858Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5598 2022-11-23T01:46:56.0052060Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5599 2022-11-23T01:46:56.0052434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0052608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0052983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0053173Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0053536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0053707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0054089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0054263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0054508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0054752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0055155Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0055547Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0055780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0056009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0056334Z STAGE:2022-11-23 01:27:35 5598:5598 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0057105Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:46:56.0057218Z warnings.warn( 2022-11-23T01:46:56.0057530Z STAGE:2022-11-23 01:27:35 5599:5599 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0058288Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:46:56.0058400Z warnings.warn( 2022-11-23T01:46:56.0058714Z [1669166855.197279] [08317a7e7676:5599 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0058952Z [1669166855.212027] [08317a7e7676:5599 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0059192Z [1669166855.212027] [08317a7e7676:5599 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0059525Z STAGE:2022-11-23 01:27:35 5599:5599 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0059795Z [1669166855.194653] [08317a7e7676:5598 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. 
Fall back to non cooperative launch. 2022-11-23T01:46:56.0060071Z [1669166855.209722] [08317a7e7676:5598 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0060304Z [1669166855.209722] [08317a7e7676:5598 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0060628Z STAGE:2022-11-23 01:27:35 5598:5598 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0060976Z STAGE:2022-11-23 01:27:35 5599:5599 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0061313Z STAGE:2022-11-23 01:27:35 5598:5598 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0061633Z STAGE:2022-11-23 01:27:35 5599:5599 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0061958Z STAGE:2022-11-23 01:27:35 5598:5598 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0062284Z STAGE:2022-11-23 01:27:35 5599:5599 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0062626Z STAGE:2022-11-23 01:27:35 5599:5599 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0062954Z STAGE:2022-11-23 01:27:35 5598:5598 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0063284Z STAGE:2022-11-23 01:27:35 5598:5598 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0063368Z ok (6.847s) 2022-11-23T01:46:56.0063388Z 2022-11-23T01:46:56.0063650Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0063760Z Ran 1 test in 6.848s 2022-11-23T01:46:56.0063780Z 2022-11-23T01:46:56.0063863Z OK 2022-11-23T01:46:56.0063881Z 2022-11-23T01:46:56.0064004Z Generating XML reports... 
2022-11-23T01:46:56.0064451Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012729.xml 2022-11-23T01:46:56.0064822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0064999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0065370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0065561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0065581Z 2022-11-23T01:46:56.0065681Z Running tests... 2022-11-23T01:46:56.0065943Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0066253Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0066506Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0066721Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5716 2022-11-23T01:46:56.0066942Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5717 2022-11-23T01:46:56.0067306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0067511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0067899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0068092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0068451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0068625Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0069293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0069572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0069818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0070054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0070465Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0070866Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0071092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0071421Z STAGE:2022-11-23 01:27:42 5717:5717 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0071647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0071973Z STAGE:2022-11-23 01:27:42 5716:5716 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0072241Z [1669166862.881860] [08317a7e7676:5716 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0072472Z [1669166864.524509] [08317a7e7676:5716 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0072708Z [1669166864.524509] [08317a7e7676:5716 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0072965Z [1669166862.904343] [08317a7e7676:5717 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0073185Z [1669166864.528220] [08317a7e7676:5717 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0073415Z [1669166864.528220] [08317a7e7676:5717 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0073978Z STAGE:2022-11-23 01:27:44 5716:5716 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:27:44 5717:5717 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0074000Z 2022-11-23T01:46:56.0074340Z STAGE:2022-11-23 01:27:44 5717:5717 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0074687Z STAGE:2022-11-23 01:27:44 5716:5716 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0075014Z STAGE:2022-11-23 01:27:45 5717:5717 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0075330Z STAGE:2022-11-23 01:27:45 5716:5716 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0075660Z STAGE:2022-11-23 01:27:45 5717:5717 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0075996Z STAGE:2022-11-23 01:27:45 5716:5716 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0076324Z STAGE:2022-11-23 01:27:45 5717:5717 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0076719Z STAGE:2022-11-23 01:27:45 5716:5716 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0076826Z ok (6.546s) 2022-11-23T01:46:56.0076845Z 2022-11-23T01:46:56.0077116Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0077230Z Ran 1 test in 6.546s 2022-11-23T01:46:56.0077250Z 2022-11-23T01:46:56.0077333Z OK 2022-11-23T01:46:56.0077351Z 2022-11-23T01:46:56.0077475Z Generating XML reports... 
2022-11-23T01:46:56.0077921Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012739.xml 2022-11-23T01:46:56.0078295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0078508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0078899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0079085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0079104Z 2022-11-23T01:46:56.0079211Z Running tests... 2022-11-23T01:46:56.0079471Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0079777Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0080043Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0080260Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5830 2022-11-23T01:46:56.0080461Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5831 2022-11-23T01:46:56.0080832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0081009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0081389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0081579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0081946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0082118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0082486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0082673Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0082908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0083145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0083552Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0083954Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0084178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0084406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0084678Z [1669166873.380663] [08317a7e7676:5830 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0084904Z [1669166873.394565] [08317a7e7676:5830 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0085142Z [1669166873.394565] [08317a7e7676:5830 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0085460Z [1669166873.380759] [08317a7e7676:5831 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0085697Z [1669166873.394538] [08317a7e7676:5831 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0085933Z [1669166873.394538] [08317a7e7676:5831 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0086024Z ok (6.157s) 2022-11-23T01:46:56.0086044Z 2022-11-23T01:46:56.0086309Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0086414Z Ran 1 test in 6.158s 2022-11-23T01:46:56.0086478Z 2022-11-23T01:46:56.0086569Z OK 2022-11-23T01:46:56.0086588Z 2022-11-23T01:46:56.0086712Z Generating XML reports... 
2022-11-23T01:46:56.0087155Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012748.xml 2022-11-23T01:46:56.0087520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0087695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0088076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0088260Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0088280Z 2022-11-23T01:46:56.0088384Z Running tests... 2022-11-23T01:46:56.0088643Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0088951Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0089208Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0089411Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 5944 2022-11-23T01:46:56.0089633Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 5945 2022-11-23T01:46:56.0090009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0090178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0090559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0090750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0091108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0091285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0091662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0091836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0092093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0092329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0092729Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0093127Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0093349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0093686Z STAGE:2022-11-23 01:28:00 5944:5944 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0093904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0094274Z STAGE:2022-11-23 01:28:00 5945:5945 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0094541Z [1669166880.771742] [08317a7e7676:5945 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0094775Z [1669166882.440838] [08317a7e7676:5945 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0095014Z [1669166882.440838] [08317a7e7676:5945 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0095274Z [1669166880.750358] [08317a7e7676:5944 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0095549Z [1669166882.434578] [08317a7e7676:5944 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0095791Z [1669166882.434578] [08317a7e7676:5944 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0096340Z STAGE:2022-11-23 01:28:02 5945:5945 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:28:02 5944:5944 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0096361Z 2022-11-23T01:46:56.0096706Z STAGE:2022-11-23 01:28:02 5945:5945 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0097054Z STAGE:2022-11-23 01:28:02 5944:5944 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0097372Z STAGE:2022-11-23 01:28:02 5944:5944 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0097682Z STAGE:2022-11-23 01:28:02 5945:5945 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0098011Z STAGE:2022-11-23 01:28:02 5944:5944 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0098346Z STAGE:2022-11-23 01:28:02 5945:5945 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0098681Z STAGE:2022-11-23 01:28:02 5944:5944 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0099019Z STAGE:2022-11-23 01:28:02 5945:5945 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0099120Z ok (6.718s) 2022-11-23T01:46:56.0099139Z 2022-11-23T01:46:56.0099393Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0099501Z Ran 1 test in 6.718s 2022-11-23T01:46:56.0099521Z 2022-11-23T01:46:56.0099610Z OK 2022-11-23T01:46:56.0099629Z 2022-11-23T01:46:56.0099735Z Generating XML reports... 
2022-11-23T01:46:56.0100183Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012756.xml 2022-11-23T01:46:56.0100559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0100738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0101113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0101303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0101323Z 2022-11-23T01:46:56.0101428Z Running tests... 2022-11-23T01:46:56.0101681Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0101991Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0102239Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0102465Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6058 2022-11-23T01:46:56.0102673Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6059 2022-11-23T01:46:56.0103091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0103268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0103651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0103843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0104199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0104358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0104732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0104961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0105211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0105455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0105860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0106251Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0106483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0106811Z STAGE:2022-11-23 01:28:09 6058:6058 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0107029Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0107351Z STAGE:2022-11-23 01:28:09 6059:6059 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0107626Z [1669166889.950738] [08317a7e7676:6059 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0107848Z [1669166891.623352] [08317a7e7676:6059 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0108079Z [1669166891.623352] [08317a7e7676:6059 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0108345Z [1669166889.946959] [08317a7e7676:6058 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0108571Z [1669166891.603840] [08317a7e7676:6058 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0108809Z [1669166891.603840] [08317a7e7676:6058 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0109587Z STAGE:2022-11-23 01:28:11 6059:6059 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:28:11 6058:6058 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0109611Z 2022-11-23T01:46:56.0110184Z STAGE:2022-11-23 01:28:11 6058:6058 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:28:11 6059:6059 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0110204Z 2022-11-23T01:46:56.0110527Z STAGE:2022-11-23 01:28:12 6058:6058 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0110837Z STAGE:2022-11-23 01:28:12 6059:6059 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0111173Z STAGE:2022-11-23 01:28:12 6058:6058 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0111531Z STAGE:2022-11-23 01:28:12 6059:6059 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0111937Z STAGE:2022-11-23 01:28:12 6058:6058 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0112293Z STAGE:2022-11-23 01:28:12 6059:6059 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0112389Z ok (6.619s) 2022-11-23T01:46:56.0112408Z 2022-11-23T01:46:56.0112674Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0112788Z Ran 1 test in 6.619s 2022-11-23T01:46:56.0112807Z 2022-11-23T01:46:56.0112892Z OK 2022-11-23T01:46:56.0112911Z 2022-11-23T01:46:56.0113017Z Generating XML reports... 
2022-11-23T01:46:56.0113469Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012806.xml 2022-11-23T01:46:56.0113914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0114084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0114471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0114658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0114678Z 2022-11-23T01:46:56.0114784Z Running tests... 2022-11-23T01:46:56.0115046Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0115344Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0115615Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0115831Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 6172 2022-11-23T01:46:56.0116052Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 6173 2022-11-23T01:46:56.0116423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0116598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0116977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0117168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0117529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0117686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0118055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0118251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0118492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0118736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0119134Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0119533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0119763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0120086Z STAGE:2022-11-23 01:28:18 6173:6173 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0120295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0120619Z STAGE:2022-11-23 01:28:19 6172:6172 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0120888Z [1669166899.096103] [08317a7e7676:6173 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0121164Z [1669166900.743682] [08317a7e7676:6173 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0121410Z [1669166900.743682] [08317a7e7676:6173 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0121674Z [1669166899.094441] [08317a7e7676:6172 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0121901Z [1669166900.758946] [08317a7e7676:6172 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0122134Z [1669166900.758946] [08317a7e7676:6172 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0122724Z STAGE:2022-11-23 01:28:21 6173:6173 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:28:21 6172:6172 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0122749Z 2022-11-23T01:46:56.0123095Z STAGE:2022-11-23 01:28:21 6173:6173 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0123423Z STAGE:2022-11-23 01:28:21 6172:6172 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0123751Z STAGE:2022-11-23 01:28:21 6173:6173 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0124079Z STAGE:2022-11-23 01:28:21 6172:6172 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0124406Z STAGE:2022-11-23 01:28:21 6173:6173 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0124735Z STAGE:2022-11-23 01:28:21 6172:6172 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0125076Z STAGE:2022-11-23 01:28:21 6173:6173 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0125408Z STAGE:2022-11-23 01:28:21 6172:6172 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0125508Z ok (6.645s) 2022-11-23T01:46:56.0125527Z 2022-11-23T01:46:56.0125783Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0125876Z Ran 1 test in 6.645s 2022-11-23T01:46:56.0125895Z 2022-11-23T01:46:56.0125983Z OK 2022-11-23T01:46:56.0126002Z 2022-11-23T01:46:56.0126123Z Generating XML reports... 
2022-11-23T01:46:56.0126566Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012815.xml 2022-11-23T01:46:56.0126940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0127118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0127499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0127685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0127704Z 2022-11-23T01:46:56.0127811Z Running tests... 2022-11-23T01:46:56.0128055Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0128358Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0128658Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T01:46:56.0128677Z 2022-11-23T01:46:56.0128937Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0129037Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0129060Z 2022-11-23T01:46:56.0129165Z OK (skipped=1) 2022-11-23T01:46:56.0129184Z 2022-11-23T01:46:56.0129306Z Generating XML reports... 
2022-11-23T01:46:56.0129752Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012824.xml 2022-11-23T01:46:56.0130167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0130334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0130719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0130904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0130923Z 2022-11-23T01:46:56.0131030Z Running tests... 2022-11-23T01:46:56.0131294Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0131600Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0131959Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T01:46:56.0131980Z 2022-11-23T01:46:56.0132244Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0132338Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0132368Z 2022-11-23T01:46:56.0132457Z OK (skipped=1) 2022-11-23T01:46:56.0132476Z 2022-11-23T01:46:56.0132595Z Generating XML reports... 
2022-11-23T01:46:56.0133031Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012826.xml 2022-11-23T01:46:56.0133401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0133574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0133946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0134140Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0134160Z 2022-11-23T01:46:56.0134265Z Running tests... 2022-11-23T01:46:56.0134515Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0134821Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0135131Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T01:46:56.0135151Z 2022-11-23T01:46:56.0135410Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0135515Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0135535Z 2022-11-23T01:46:56.0135642Z OK (skipped=1) 2022-11-23T01:46:56.0135662Z 2022-11-23T01:46:56.0135782Z Generating XML reports... 
2022-11-23T01:46:56.0136221Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012829.xml 2022-11-23T01:46:56.0136593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0136753Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0137125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0137317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0137337Z 2022-11-23T01:46:56.0137440Z Running tests... 2022-11-23T01:46:56.0137698Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0138000Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0138244Z test_all_to_all (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:46:56.0138267Z 2022-11-23T01:46:56.0138519Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0138627Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0138646Z 2022-11-23T01:46:56.0138735Z OK (skipped=1) 2022-11-23T01:46:56.0138754Z 2022-11-23T01:46:56.0138974Z Generating XML reports... 
2022-11-23T01:46:56.0139419Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012831.xml 2022-11-23T01:46:56.0139787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0139954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0140333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0140526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0140595Z 2022-11-23T01:46:56.0140701Z Running tests... 2022-11-23T01:46:56.0140946Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0141257Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0141520Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:46:56.0141540Z 2022-11-23T01:46:56.0141792Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0141902Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0141922Z 2022-11-23T01:46:56.0142028Z OK (skipped=1) 2022-11-23T01:46:56.0142048Z 2022-11-23T01:46:56.0142168Z Generating XML reports... 
2022-11-23T01:46:56.0142613Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012834.xml 2022-11-23T01:46:56.0142986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0143149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0143529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0143722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0143741Z 2022-11-23T01:46:56.0143845Z Running tests... 2022-11-23T01:46:56.0144108Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0144419Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0144678Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T01:46:56.0144697Z 2022-11-23T01:46:56.0144951Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0145059Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0145082Z 2022-11-23T01:46:56.0145172Z OK (skipped=1) 2022-11-23T01:46:56.0145192Z 2022-11-23T01:46:56.0145312Z Generating XML reports... 
2022-11-23T01:46:56.0145760Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012836.xml 2022-11-23T01:46:56.0146141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0146316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0146700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0146892Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0146912Z 2022-11-23T01:46:56.0147018Z Running tests... 2022-11-23T01:46:56.0147262Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0147578Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0147853Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T01:46:56.0147872Z 2022-11-23T01:46:56.0148192Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0148309Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0148328Z 2022-11-23T01:46:56.0148437Z OK (skipped=1) 2022-11-23T01:46:56.0148456Z 2022-11-23T01:46:56.0148580Z Generating XML reports... 
2022-11-23T01:46:56.0149202Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012838.xml 2022-11-23T01:46:56.0149591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0149750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0150134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0150406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0150425Z 2022-11-23T01:46:56.0150533Z Running tests... 2022-11-23T01:46:56.0150800Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0151114Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0151379Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:46:56.0151399Z 2022-11-23T01:46:56.0151653Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0151762Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0151781Z 2022-11-23T01:46:56.0151870Z OK (skipped=1) 2022-11-23T01:46:56.0151889Z 2022-11-23T01:46:56.0152010Z Generating XML reports... 
2022-11-23T01:46:56.0152459Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012841.xml 2022-11-23T01:46:56.0152839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0153014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0153398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0153590Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0153609Z 2022-11-23T01:46:56.0153713Z Running tests... 2022-11-23T01:46:56.0153960Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0154275Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0154551Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T01:46:56.0154575Z 2022-11-23T01:46:56.0154836Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0154949Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0154968Z 2022-11-23T01:46:56.0155073Z OK (skipped=1) 2022-11-23T01:46:56.0155092Z 2022-11-23T01:46:56.0155212Z Generating XML reports... 
2022-11-23T01:46:56.0155660Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012843.xml 2022-11-23T01:46:56.0156034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0156193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0156576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0156766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0156786Z 2022-11-23T01:46:56.0156895Z Running tests... 2022-11-23T01:46:56.0157156Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0157467Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0157785Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T01:46:56.0157806Z 2022-11-23T01:46:56.0158077Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0158189Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0158208Z 2022-11-23T01:46:56.0158297Z OK (skipped=1) 2022-11-23T01:46:56.0158315Z 2022-11-23T01:46:56.0158439Z Generating XML reports... 
2022-11-23T01:46:56.0158888Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012846.xml 2022-11-23T01:46:56.0159265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0159491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0159878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0160076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0160096Z 2022-11-23T01:46:56.0160204Z Running tests... 2022-11-23T01:46:56.0160465Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0160760Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0161038Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0161057Z 2022-11-23T01:46:56.0161321Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0161431Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0161451Z 2022-11-23T01:46:56.0161559Z OK (skipped=1) 2022-11-23T01:46:56.0161578Z 2022-11-23T01:46:56.0161698Z Generating XML reports... 
2022-11-23T01:46:56.0162144Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012848.xml 2022-11-23T01:46:56.0162524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0162699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0163065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0163257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0163277Z 2022-11-23T01:46:56.0163382Z Running tests... 2022-11-23T01:46:56.0163642Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0163954Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0164243Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0164263Z 2022-11-23T01:46:56.0164522Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0164633Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0164653Z 2022-11-23T01:46:56.0164742Z OK (skipped=1) 2022-11-23T01:46:56.0164776Z 2022-11-23T01:46:56.0164882Z Generating XML reports... 
2022-11-23T01:46:56.0165323Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012850.xml 2022-11-23T01:46:56.0165696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0165870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0166249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0166443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0166463Z 2022-11-23T01:46:56.0166569Z Running tests... 2022-11-23T01:46:56.0166831Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0167174Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0167481Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0167501Z 2022-11-23T01:46:56.0167767Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0167879Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0167898Z 2022-11-23T01:46:56.0168005Z OK (skipped=1) 2022-11-23T01:46:56.0168024Z 2022-11-23T01:46:56.0168147Z Generating XML reports... 
2022-11-23T01:46:56.0168593Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012853.xml 2022-11-23T01:46:56.0169018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0169195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0169568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0169763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0169783Z 2022-11-23T01:46:56.0169890Z Running tests... 2022-11-23T01:46:56.0170150Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0170453Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0170751Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0170775Z 2022-11-23T01:46:56.0171033Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0171143Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0171162Z 2022-11-23T01:46:56.0171266Z OK (skipped=1) 2022-11-23T01:46:56.0171286Z 2022-11-23T01:46:56.0171393Z Generating XML reports... 
2022-11-23T01:46:56.0171841Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012855.xml 2022-11-23T01:46:56.0172215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0172390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0172773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0172966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0172985Z 2022-11-23T01:46:56.0173096Z Running tests... 2022-11-23T01:46:56.0173355Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0173652Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0173963Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0173983Z 2022-11-23T01:46:56.0174240Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0174350Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0174369Z 2022-11-23T01:46:56.0174474Z OK (skipped=1) 2022-11-23T01:46:56.0174494Z 2022-11-23T01:46:56.0174615Z Generating XML reports... 
2022-11-23T01:46:56.0175054Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012857.xml 2022-11-23T01:46:56.0175424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0175604Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0175966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0176212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0176233Z 2022-11-23T01:46:56.0176345Z Running tests... 2022-11-23T01:46:56.0176609Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0176923Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0177226Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0177245Z 2022-11-23T01:46:56.0177503Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0177616Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0177678Z 2022-11-23T01:46:56.0177789Z OK (skipped=1) 2022-11-23T01:46:56.0177808Z 2022-11-23T01:46:56.0177914Z Generating XML reports... 
2022-11-23T01:46:56.0178365Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012900.xml 2022-11-23T01:46:56.0178745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0178920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0179300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0179491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0179511Z 2022-11-23T01:46:56.0179617Z Running tests... 2022-11-23T01:46:56.0179877Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0180189Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0180485Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0180506Z 2022-11-23T01:46:56.0180768Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0180877Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0180896Z 2022-11-23T01:46:56.0180998Z OK (skipped=1) 2022-11-23T01:46:56.0181017Z 2022-11-23T01:46:56.0181136Z Generating XML reports... 
2022-11-23T01:46:56.0181579Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012902.xml 2022-11-23T01:46:56.0181954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0182129Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0182513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0182688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0182708Z 2022-11-23T01:46:56.0182814Z Running tests... 2022-11-23T01:46:56.0183076Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0183386Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0183678Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0183698Z 2022-11-23T01:46:56.0183958Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0184066Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0184086Z 2022-11-23T01:46:56.0184190Z OK (skipped=1) 2022-11-23T01:46:56.0184209Z 2022-11-23T01:46:56.0184328Z Generating XML reports... 
2022-11-23T01:46:56.0184761Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012905.xml 2022-11-23T01:46:56.0185136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0185357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0185747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0185938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0185960Z 2022-11-23T01:46:56.0186066Z Running tests... 2022-11-23T01:46:56.0186325Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0186634Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0186921Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0187004Z 2022-11-23T01:46:56.0187252Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0187365Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0187384Z 2022-11-23T01:46:56.0187494Z OK (skipped=1) 2022-11-23T01:46:56.0187514Z 2022-11-23T01:46:56.0187638Z Generating XML reports... 
2022-11-23T01:46:56.0188080Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012907.xml 2022-11-23T01:46:56.0188453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0188626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0189215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0189399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0189443Z 2022-11-23T01:46:56.0189534Z Running tests... 2022-11-23T01:46:56.0189799Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0190118Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0190407Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0190427Z 2022-11-23T01:46:56.0190684Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0190791Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0190811Z 2022-11-23T01:46:56.0190916Z OK (skipped=1) 2022-11-23T01:46:56.0190935Z 2022-11-23T01:46:56.0191055Z Generating XML reports... 
2022-11-23T01:46:56.0191481Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012909.xml 2022-11-23T01:46:56.0191861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0192035Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0192416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0192607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0192627Z 2022-11-23T01:46:56.0192737Z Running tests... 2022-11-23T01:46:56.0192997Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0193306Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0193592Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0193628Z 2022-11-23T01:46:56.0193870Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0193984Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0194003Z 2022-11-23T01:46:56.0194108Z OK (skipped=1) 2022-11-23T01:46:56.0194127Z 2022-11-23T01:46:56.0194248Z Generating XML reports... 
2022-11-23T01:46:56.0194761Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012912.xml 2022-11-23T01:46:56.0195143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0195320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0195699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0195888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0195908Z 2022-11-23T01:46:56.0195997Z Running tests... 2022-11-23T01:46:56.0196263Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0196645Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0196945Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0196969Z 2022-11-23T01:46:56.0197233Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0197343Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0197362Z 2022-11-23T01:46:56.0197468Z OK (skipped=1) 2022-11-23T01:46:56.0197487Z 2022-11-23T01:46:56.0197612Z Generating XML reports... 
2022-11-23T01:46:56.0198062Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012914.xml 2022-11-23T01:46:56.0198422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0198598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0198988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0199179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0199199Z 2022-11-23T01:46:56.0199307Z Running tests... 2022-11-23T01:46:56.0199573Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0199888Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0200203Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0200223Z 2022-11-23T01:46:56.0200483Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0200576Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0200595Z 2022-11-23T01:46:56.0200702Z OK (skipped=1) 2022-11-23T01:46:56.0200724Z 2022-11-23T01:46:56.0200847Z Generating XML reports... 
2022-11-23T01:46:56.0201291Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012917.xml 2022-11-23T01:46:56.0201666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0201843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0202225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0202416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0202436Z 2022-11-23T01:46:56.0202525Z Running tests... 2022-11-23T01:46:56.0202786Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0203100Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0203410Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0203430Z 2022-11-23T01:46:56.0203686Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0203844Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0203865Z 2022-11-23T01:46:56.0203975Z OK (skipped=1) 2022-11-23T01:46:56.0203994Z 2022-11-23T01:46:56.0204117Z Generating XML reports... 
2022-11-23T01:46:56.0209444Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012919.xml 2022-11-23T01:46:56.0209835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0209997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0210384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0210675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0210697Z 2022-11-23T01:46:56.0210806Z Running tests... 2022-11-23T01:46:56.0211122Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0211447Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0211762Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0211783Z 2022-11-23T01:46:56.0212048Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0212143Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0212179Z 2022-11-23T01:46:56.0212268Z OK (skipped=1) 2022-11-23T01:46:56.0212288Z 2022-11-23T01:46:56.0212408Z Generating XML reports... 
2022-11-23T01:46:56.0212859Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012921.xml 2022-11-23T01:46:56.0213240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0213414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0213803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0213994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0214015Z 2022-11-23T01:46:56.0214120Z Running tests... 2022-11-23T01:46:56.0214371Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0214688Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0214986Z test_all_to_all_single_unequal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T01:46:56.0215010Z 2022-11-23T01:46:56.0215273Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0215385Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0215405Z 2022-11-23T01:46:56.0215511Z OK (skipped=1) 2022-11-23T01:46:56.0215530Z 2022-11-23T01:46:56.0215659Z Generating XML reports... 
2022-11-23T01:46:56.0216108Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012924.xml 2022-11-23T01:46:56.0216485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0216646Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0217032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0217227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0217247Z 2022-11-23T01:46:56.0217358Z Running tests... 2022-11-23T01:46:56.0217621Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0217936Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0218299Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T01:46:56.0218322Z 2022-11-23T01:46:56.0218591Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0218704Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0218724Z 2022-11-23T01:46:56.0218813Z OK (skipped=1) 2022-11-23T01:46:56.0218832Z 2022-11-23T01:46:56.0218957Z Generating XML reports... 
2022-11-23T01:46:56.0219404Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012926.xml 2022-11-23T01:46:56.0219775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0220004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0220392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0220591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0220611Z 2022-11-23T01:46:56.0220720Z Running tests... 2022-11-23T01:46:56.0220969Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0221283Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0221551Z test_average_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0221772Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7177 2022-11-23T01:46:56.0221994Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7178 2022-11-23T01:46:56.0222373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0222552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0222941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0223136Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0223491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0223666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0224044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0224238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0224487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0224741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0225154Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0225556Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0225788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0226003Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0226282Z [1669166975.370642] [08317a7e7676:7178 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0226515Z [1669166975.384124] [08317a7e7676:7178 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0226761Z [1669166975.384124] [08317a7e7676:7178 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0227080Z [1669166975.362649] [08317a7e7676:7177 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0227318Z [1669166975.377177] [08317a7e7676:7177 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0227555Z [1669166975.377177] [08317a7e7676:7177 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0227805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0228054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0228513Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0228899Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T01:46:56.0229324Z ok (7.388s) 2022-11-23T01:46:56.0229348Z 2022-11-23T01:46:56.0229628Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0229742Z Ran 1 test in 7.388s 2022-11-23T01:46:56.0229762Z 2022-11-23T01:46:56.0229856Z OK 2022-11-23T01:46:56.0229876Z 2022-11-23T01:46:56.0230001Z Generating XML reports... 2022-11-23T01:46:56.0230457Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012929.xml 2022-11-23T01:46:56.0230837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0231001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0231392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0231587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0231607Z 2022-11-23T01:46:56.0231717Z Running tests... 2022-11-23T01:46:56.0231982Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0232298Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0232563Z test_backend_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0232786Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7301 2022-11-23T01:46:56.0233008Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7302 2022-11-23T01:46:56.0233365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0233549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0233933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0234133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0234502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0234679Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0235060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0235252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0235484Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0235733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0236143Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0236630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0236873Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0237106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0237258Z skip: Need at least 3 CUDA devices (4.291s) 2022-11-23T01:46:56.0237279Z 2022-11-23T01:46:56.0237549Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0237662Z Ran 1 test in 4.291s 2022-11-23T01:46:56.0237682Z 2022-11-23T01:46:56.0237770Z OK (skipped=1) 2022-11-23T01:46:56.0237790Z 2022-11-23T01:46:56.0237984Z Generating XML reports... 2022-11-23T01:46:56.0238443Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012939.xml 2022-11-23T01:46:56.0238822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0239001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0239384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0239578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0239598Z 2022-11-23T01:46:56.0239705Z Running tests... 2022-11-23T01:46:56.0239970Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0240270Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0240524Z test_backend_group (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 3 (0.002s) 2022-11-23T01:46:56.0240549Z 2022-11-23T01:46:56.0240811Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0240923Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0240943Z 2022-11-23T01:46:56.0241053Z OK (skipped=1) 2022-11-23T01:46:56.0241073Z 2022-11-23T01:46:56.0241196Z Generating XML reports... 2022-11-23T01:46:56.0241645Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012945.xml 2022-11-23T01:46:56.0242022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0242181Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0242568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0242767Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0242791Z 2022-11-23T01:46:56.0242899Z Running tests... 2022-11-23T01:46:56.0243164Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0243480Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0243732Z test_barrier (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T01:46:56.0243752Z 2022-11-23T01:46:56.0244014Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0244126Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0244145Z 2022-11-23T01:46:56.0244233Z OK (skipped=1) 2022-11-23T01:46:56.0244252Z 2022-11-23T01:46:56.0244380Z Generating XML reports... 
2022-11-23T01:46:56.0244832Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012948.xml 2022-11-23T01:46:56.0245207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0245389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0245824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0246025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0246046Z 2022-11-23T01:46:56.0246154Z Running tests... 2022-11-23T01:46:56.0246420Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0246715Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0246971Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0247191Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7470 2022-11-23T01:46:56.0247411Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7471 2022-11-23T01:46:56.0247839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0248016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0248402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0248598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0248946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0249120Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0249496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0249685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0249940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0250186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0250593Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0250993Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0251228Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0251440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0251716Z [1669166996.029882] [08317a7e7676:7471 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0251954Z [1669166996.044329] [08317a7e7676:7471 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0252193Z [1669166996.044329] [08317a7e7676:7471 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0252467Z [1669166996.029859] [08317a7e7676:7470 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0252696Z [1669166996.044395] [08317a7e7676:7470 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0252934Z [1669166996.044395] [08317a7e7676:7470 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0253036Z ok (6.964s) 2022-11-23T01:46:56.0253056Z 2022-11-23T01:46:56.0253325Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0253420Z Ran 1 test in 6.965s 2022-11-23T01:46:56.0253460Z 2022-11-23T01:46:56.0253534Z OK 2022-11-23T01:46:56.0253552Z 2022-11-23T01:46:56.0253676Z Generating XML reports... 
2022-11-23T01:46:56.0254131Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012950.xml 2022-11-23T01:46:56.0254557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0254743Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0255136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0255331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0255351Z 2022-11-23T01:46:56.0255459Z Running tests... 2022-11-23T01:46:56.0255704Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0256019Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0256340Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T01:46:56.0256360Z 2022-11-23T01:46:56.0256625Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0256736Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0256756Z 2022-11-23T01:46:56.0256865Z OK (skipped=1) 2022-11-23T01:46:56.0256884Z 2022-11-23T01:46:56.0257004Z Generating XML reports... 
2022-11-23T01:46:56.0257455Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013000.xml 2022-11-23T01:46:56.0257831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0257991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0258378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0258576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0258596Z 2022-11-23T01:46:56.0258704Z Running tests... 2022-11-23T01:46:56.0258974Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0259292Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0259566Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0259789Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7617 2022-11-23T01:46:56.0259992Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7618 2022-11-23T01:46:56.0260369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0260543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0260937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0261128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0261498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0261676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0262056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0262249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0262482Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0262728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0263134Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0263535Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0263818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0264059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0264220Z skip: Skipped due to small world size. (4.237s) 2022-11-23T01:46:56.0264240Z 2022-11-23T01:46:56.0264510Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0264622Z Ran 1 test in 4.237s 2022-11-23T01:46:56.0264643Z 2022-11-23T01:46:56.0264732Z OK (skipped=1) 2022-11-23T01:46:56.0264750Z 2022-11-23T01:46:56.0264875Z Generating XML reports... 2022-11-23T01:46:56.0265324Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013002.xml 2022-11-23T01:46:56.0265754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0265936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0266323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0266519Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0266539Z 2022-11-23T01:46:56.0266646Z Running tests... 2022-11-23T01:46:56.0266895Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0267210Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0267474Z test_barrier_group (__main__.TestDistBackendWithSpawn) ... 
skip: ucc does not support CPU barrier (0.002s) 2022-11-23T01:46:56.0267498Z 2022-11-23T01:46:56.0267760Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0267870Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0267889Z 2022-11-23T01:46:56.0267995Z OK (skipped=1) 2022-11-23T01:46:56.0268014Z 2022-11-23T01:46:56.0268140Z Generating XML reports... 2022-11-23T01:46:56.0268593Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013009.xml 2022-11-23T01:46:56.0269193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0269367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0269761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0269956Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0269976Z 2022-11-23T01:46:56.0270088Z Running tests... 2022-11-23T01:46:56.0270353Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0270674Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0270940Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0271161Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 7753 2022-11-23T01:46:56.0271379Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 7754 2022-11-23T01:46:56.0271739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0271917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0272300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0272494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0272869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0273043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0273505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0273710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0273942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0274189Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0274601Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0275005Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0275349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0275584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0275744Z skip: Skipped due to small world size. (4.118s) 2022-11-23T01:46:56.0275764Z 2022-11-23T01:46:56.0276034Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0276147Z Ran 1 test in 4.118s 2022-11-23T01:46:56.0276167Z 2022-11-23T01:46:56.0276256Z OK (skipped=1) 2022-11-23T01:46:56.0276275Z 2022-11-23T01:46:56.0276399Z Generating XML reports... 2022-11-23T01:46:56.0276850Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013011.xml 2022-11-23T01:46:56.0277225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0277406Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0277790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0277987Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0278007Z 2022-11-23T01:46:56.0278116Z Running tests... 2022-11-23T01:46:56.0278384Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0278681Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0278966Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) ... 
skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T01:46:56.0278987Z 2022-11-23T01:46:56.0279246Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0279357Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0279380Z 2022-11-23T01:46:56.0279487Z OK (skipped=1) 2022-11-23T01:46:56.0279507Z 2022-11-23T01:46:56.0279627Z Generating XML reports... 2022-11-23T01:46:56.0280078Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013018.xml 2022-11-23T01:46:56.0280458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0280623Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0281004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0281196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0281216Z 2022-11-23T01:46:56.0281321Z Running tests... 2022-11-23T01:46:56.0281587Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0281900Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0282183Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T01:46:56.0282203Z 2022-11-23T01:46:56.0282465Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0282625Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0282647Z 2022-11-23T01:46:56.0282740Z OK (skipped=1) 2022-11-23T01:46:56.0282779Z 2022-11-23T01:46:56.0282885Z Generating XML reports... 
2022-11-23T01:46:56.0283334Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013020.xml 2022-11-23T01:46:56.0283711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0283887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0284274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0284522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0284541Z 2022-11-23T01:46:56.0284648Z Running tests... 2022-11-23T01:46:56.0284916Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0285213Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0285493Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T01:46:56.0285512Z 2022-11-23T01:46:56.0285772Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0285883Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0285903Z 2022-11-23T01:46:56.0286009Z OK (skipped=1) 2022-11-23T01:46:56.0286028Z 2022-11-23T01:46:56.0286147Z Generating XML reports... 
2022-11-23T01:46:56.0286596Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013023.xml 2022-11-23T01:46:56.0286972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0287149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0287519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0287716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0287736Z 2022-11-23T01:46:56.0287843Z Running tests... 2022-11-23T01:46:56.0288106Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0288420Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0288678Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T01:46:56.0288702Z 2022-11-23T01:46:56.0288963Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0289077Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0289096Z 2022-11-23T01:46:56.0289185Z OK (skipped=1) 2022-11-23T01:46:56.0289222Z 2022-11-23T01:46:56.0289327Z Generating XML reports... 
2022-11-23T01:46:56.0289779Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013025.xml 2022-11-23T01:46:56.0290156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0290333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0290715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0290907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0290926Z 2022-11-23T01:46:56.0291031Z Running tests... 2022-11-23T01:46:56.0291301Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0291598Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0291913Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T01:46:56.0291937Z 2022-11-23T01:46:56.0292205Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0292316Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0292336Z 2022-11-23T01:46:56.0292442Z OK (skipped=1) 2022-11-23T01:46:56.0292461Z 2022-11-23T01:46:56.0292585Z Generating XML reports... 
2022-11-23T01:46:56.0293028Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013027.xml 2022-11-23T01:46:56.0293403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0293629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0293998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0294190Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0294214Z 2022-11-23T01:46:56.0294323Z Running tests... 2022-11-23T01:46:56.0294585Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0294897Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0295175Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:46:56.0295195Z 2022-11-23T01:46:56.0295455Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0295567Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0295587Z 2022-11-23T01:46:56.0295675Z OK (skipped=1) 2022-11-23T01:46:56.0295715Z 2022-11-23T01:46:56.0295821Z Generating XML reports... 
2022-11-23T01:46:56.0296269Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013030.xml 2022-11-23T01:46:56.0296650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0296827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0297210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0297403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0297423Z 2022-11-23T01:46:56.0297528Z Running tests... 2022-11-23T01:46:56.0297791Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0298087Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0298351Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T01:46:56.0298372Z 2022-11-23T01:46:56.0298634Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0298745Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0298768Z 2022-11-23T01:46:56.0298873Z OK (skipped=1) 2022-11-23T01:46:56.0298892Z 2022-11-23T01:46:56.0299016Z Generating XML reports... 
2022-11-23T01:46:56.0299463Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013032.xml 2022-11-23T01:46:56.0299835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0300011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0300376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0300574Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0300594Z 2022-11-23T01:46:56.0300702Z Running tests... 2022-11-23T01:46:56.0300965Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0301327Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0301608Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T01:46:56.0301628Z 2022-11-23T01:46:56.0301887Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0302001Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0302020Z 2022-11-23T01:46:56.0302126Z OK (skipped=1) 2022-11-23T01:46:56.0302145Z 2022-11-23T01:46:56.0302251Z Generating XML reports... 
2022-11-23T01:46:56.0302696Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013035.xml 2022-11-23T01:46:56.0303125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0303302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0303688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0303882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0303902Z 2022-11-23T01:46:56.0304010Z Running tests... 2022-11-23T01:46:56.0304271Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0304566Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0304830Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:46:56.0304849Z 2022-11-23T01:46:56.0305110Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0305227Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0305247Z 2022-11-23T01:46:56.0305353Z OK (skipped=1) 2022-11-23T01:46:56.0305373Z 2022-11-23T01:46:56.0305495Z Generating XML reports... 
2022-11-23T01:46:56.0305944Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013037.xml 2022-11-23T01:46:56.0306318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0306497Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0306862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0307058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0307077Z 2022-11-23T01:46:56.0307185Z Running tests... 2022-11-23T01:46:56.0307446Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0307763Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0308029Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:46:56.0308053Z 2022-11-23T01:46:56.0308313Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0308425Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0308444Z 2022-11-23T01:46:56.0308549Z OK (skipped=1) 2022-11-23T01:46:56.0308568Z 2022-11-23T01:46:56.0308672Z Generating XML reports... 
2022-11-23T01:46:56.0309279Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013039.xml 2022-11-23T01:46:56.0309672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0309848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0310239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0310433Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0310453Z 2022-11-23T01:46:56.0310639Z Running tests... 2022-11-23T01:46:56.0310914Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0311249Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0311532Z test_batch_isend_irecv_ring_exchange_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:46:56.0311551Z 2022-11-23T01:46:56.0311807Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0311914Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0311933Z 2022-11-23T01:46:56.0312037Z OK (skipped=1) 2022-11-23T01:46:56.0312056Z 2022-11-23T01:46:56.0312242Z Generating XML reports... 
2022-11-23T01:46:56.0312696Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013042.xml 2022-11-23T01:46:56.0313077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0313254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0313623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0313815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0313835Z 2022-11-23T01:46:56.0313943Z Running tests... 2022-11-23T01:46:56.0314207Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0314519Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0314785Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:46:56.0314809Z 2022-11-23T01:46:56.0315066Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0315178Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0315197Z 2022-11-23T01:46:56.0315306Z OK (skipped=1) 2022-11-23T01:46:56.0315327Z 2022-11-23T01:46:56.0315433Z Generating XML reports... 
2022-11-23T01:46:56.0315879Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013044.xml 2022-11-23T01:46:56.0316254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0316429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0316814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0317012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0317032Z 2022-11-23T01:46:56.0317137Z Running tests... 2022-11-23T01:46:56.0317400Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0317715Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0317967Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T01:46:56.0317987Z 2022-11-23T01:46:56.0318248Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0318358Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0318378Z 2022-11-23T01:46:56.0318482Z OK (skipped=1) 2022-11-23T01:46:56.0318501Z 2022-11-23T01:46:56.0318621Z Generating XML reports... 
2022-11-23T01:46:56.0319067Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013046.xml 2022-11-23T01:46:56.0319445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0319621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0320035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0320237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0320256Z 2022-11-23T01:46:56.0320364Z Running tests... 2022-11-23T01:46:56.0320631Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0320941Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0321191Z test_broadcast (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0321413Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8285 2022-11-23T01:46:56.0321680Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8286 2022-11-23T01:46:56.0322054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0322212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0322599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0322793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0323163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0323337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0323716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0323907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0324165Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0324399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0324805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0325209Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0325443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0325782Z STAGE:2022-11-23 01:30:53 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0326015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0326345Z STAGE:2022-11-23 01:30:53 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0326629Z [1669167053.248933] [08317a7e7676:8285 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0326864Z [1669167054.910511] [08317a7e7676:8285 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0327101Z [1669167054.910511] [08317a7e7676:8285 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0327358Z [1669167053.250504] [08317a7e7676:8286 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0327585Z [1669167054.887872] [08317a7e7676:8286 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0327825Z [1669167054.887872] [08317a7e7676:8286 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0328389Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0328411Z 2022-11-23T01:46:56.0328811Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0329167Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0329496Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0329832Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0330179Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0330503Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0330871Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0331203Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0331553Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0331879Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0332212Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 
2022-11-23T01:46:56.0332540Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0332887Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0333223Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0333551Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0333864Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0334409Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0334430Z 2022-11-23T01:46:56.0334998Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0335018Z 2022-11-23T01:46:56.0335341Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0335669Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0335999Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0336552Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0336573Z 2022-11-23T01:46:56.0336918Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0337243Z STAGE:2022-11-23 
01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0337563Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0337895Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0338218Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0338563Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0338967Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0339306Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0339630Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0339963Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0340306Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0340640Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0341038Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0341344Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0341670Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0342001Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0342327Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] 
Completed Stage: Collection 2022-11-23T01:46:56.0342670Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0343008Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0343328Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0343659Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0343994Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0344306Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0344648Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0344992Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0345316Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0345642Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0346185Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0346210Z 2022-11-23T01:46:56.0346774Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0346795Z 2022-11-23T01:46:56.0347119Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 
2022-11-23T01:46:56.0347442Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0347779Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0348095Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0348440Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0348790Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0349423Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0350019Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0350369Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0350695Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0351037Z STAGE:2022-11-23 01:30:55 8285:8285 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0351378Z STAGE:2022-11-23 01:30:55 8286:8286 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0351463Z ok (6.652s) 2022-11-23T01:46:56.0351483Z 2022-11-23T01:46:56.0351813Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0351926Z Ran 1 test in 6.652s 2022-11-23T01:46:56.0351946Z 2022-11-23T01:46:56.0352036Z OK 2022-11-23T01:46:56.0352055Z 2022-11-23T01:46:56.0352178Z Generating XML reports... 
2022-11-23T01:46:56.0352639Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013049.xml 2022-11-23T01:46:56.0353020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0353199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0353567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0353763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0353782Z 2022-11-23T01:46:56.0353891Z Running tests... 2022-11-23T01:46:56.0354159Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0354472Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0354762Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and Nccl backend supports CUDA allReduce (0.002s) 2022-11-23T01:46:56.0354782Z 2022-11-23T01:46:56.0355044Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0355153Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0355172Z 2022-11-23T01:46:56.0355278Z OK (skipped=1) 2022-11-23T01:46:56.0355298Z 2022-11-23T01:46:56.0355402Z Generating XML reports... 
2022-11-23T01:46:56.0355848Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013058.xml 2022-11-23T01:46:56.0356227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0356405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0356789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0356982Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0357005Z 2022-11-23T01:46:56.0357112Z Running tests... 2022-11-23T01:46:56.0357374Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0357688Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0357938Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0358158Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8432 2022-11-23T01:46:56.0358375Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8433 2022-11-23T01:46:56.0358749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0358930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0359363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0359564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0359933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0360089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0360467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0360655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0360903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0361202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0361610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0362010Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0362243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0362486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0362700Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0362946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0363341Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0363739Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0364080Z STAGE:2022-11-23 01:31:04 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0364410Z STAGE:2022-11-23 01:31:04 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0364690Z [1669167064.906802] [08317a7e7676:8433 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0364921Z [1669167066.557759] [08317a7e7676:8433 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0365159Z [1669167066.557759] [08317a7e7676:8433 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0365438Z [1669167064.886402] [08317a7e7676:8432 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0365650Z [1669167066.516737] [08317a7e7676:8432 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0365890Z [1669167066.516737] [08317a7e7676:8432 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0366446Z STAGE:2022-11-23 01:31:06 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:31:06 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0366468Z 2022-11-23T01:46:56.0366817Z STAGE:2022-11-23 01:31:06 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0367162Z STAGE:2022-11-23 01:31:06 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0367493Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0367817Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0368198Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0368534Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0368877Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0369202Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0369523Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0369848Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0370233Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 
2022-11-23T01:46:56.0370559Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0370905Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0371246Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0371572Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0371879Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0372214Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0372768Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0372791Z 2022-11-23T01:46:56.0373133Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0373458Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0373782Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0374110Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0374434Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0374782Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0375123Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0375432Z STAGE:2022-11-23 
01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0375752Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0376087Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0376642Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0376663Z 2022-11-23T01:46:56.0377002Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0377323Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0377645Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0377979Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0378304Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0378694Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0379031Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0379352Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0379674Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0380002Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0380326Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] 
Completed Stage: Collection 2022-11-23T01:46:56.0380720Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0381064Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0381391Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0381695Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0382024Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0382351Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0382694Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0383031Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0383355Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0383682Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0384022Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0384573Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0384594Z 2022-11-23T01:46:56.0384936Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0385238Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 
2022-11-23T01:46:56.0385561Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0385889Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0386218Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0386562Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0386906Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0387229Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0387552Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0387888Z STAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0388428Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:31:07 8432:8432 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0388471Z 2022-11-23T01:46:56.0388843Z STAGE:2022-11-23 01:31:07 8433:8433 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0389119Z ok (6.655s) 2022-11-23T01:46:56.0389140Z 2022-11-23T01:46:56.0389420Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0389535Z Ran 1 test in 6.656s 2022-11-23T01:46:56.0389555Z 2022-11-23T01:46:56.0389647Z OK 2022-11-23T01:46:56.0389667Z 2022-11-23T01:46:56.0389793Z Generating XML reports... 
2022-11-23T01:46:56.0390253Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013100.xml 2022-11-23T01:46:56.0390633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0390872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0391264Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0391470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0391489Z 2022-11-23T01:46:56.0391597Z Running tests... 2022-11-23T01:46:56.0391863Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0392178Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0392442Z test_broadcast_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0392663Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8546 2022-11-23T01:46:56.0392887Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8547 2022-11-23T01:46:56.0393247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0393429Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0393817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0394012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0394383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0394555Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0394935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0395125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0395360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0395612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0396018Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0396422Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0396653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0396877Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0397038Z skip: Skipped due to small world size. (4.218s) 2022-11-23T01:46:56.0397057Z 2022-11-23T01:46:56.0397324Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0397436Z Ran 1 test in 4.218s 2022-11-23T01:46:56.0397456Z 2022-11-23T01:46:56.0397548Z OK (skipped=1) 2022-11-23T01:46:56.0397567Z 2022-11-23T01:46:56.0397691Z Generating XML reports... 2022-11-23T01:46:56.0398144Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013110.xml 2022-11-23T01:46:56.0398584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0398769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0399158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0399351Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0399371Z 2022-11-23T01:46:56.0399477Z Running tests... 2022-11-23T01:46:56.0399741Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0400039Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0400353Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0400573Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8649 2022-11-23T01:46:56.0400793Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8650 2022-11-23T01:46:56.0401169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0401347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0401734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0401926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0402278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0402455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0402836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0403027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0403278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0403524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0403929Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0404330Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0404561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0404779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0405571Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:46:56.0405687Z warnings.warn( 2022-11-23T01:46:56.0406463Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T01:46:56.0406570Z warnings.warn( 2022-11-23T01:46:56.0406842Z [1669167082.251418] [08317a7e7676:8650 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0407078Z [1669167082.264788] [08317a7e7676:8650 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0407377Z [1669167082.264788] [08317a7e7676:8650 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0407658Z [1669167082.250398] [08317a7e7676:8649 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0407890Z [1669167082.264329] [08317a7e7676:8649 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0408129Z [1669167082.264329] [08317a7e7676:8649 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0408214Z ok (6.163s) 2022-11-23T01:46:56.0408234Z 2022-11-23T01:46:56.0408508Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0408667Z Ran 1 test in 6.163s 2022-11-23T01:46:56.0408687Z 2022-11-23T01:46:56.0408779Z OK 2022-11-23T01:46:56.0408797Z 2022-11-23T01:46:56.0408921Z Generating XML reports... 2022-11-23T01:46:56.0409378Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013116.xml 2022-11-23T01:46:56.0409753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0409930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0410297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0410490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0410509Z 2022-11-23T01:46:56.0410619Z Running tests... 2022-11-23T01:46:56.0410881Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0411244Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0411513Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0412270Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82847 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.654s) 2022-11-23T01:46:56.0412291Z 2022-11-23T01:46:56.0412555Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0412669Z Ran 1 test in 1.654s 2022-11-23T01:46:56.0412688Z 2022-11-23T01:46:56.0412794Z OK (skipped=1) 2022-11-23T01:46:56.0412814Z 2022-11-23T01:46:56.0412920Z Generating XML reports... 2022-11-23T01:46:56.0413370Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013125.xml 2022-11-23T01:46:56.0413750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0413925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0414310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0414504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0414524Z 2022-11-23T01:46:56.0414631Z Running tests... 2022-11-23T01:46:56.0414894Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0415188Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0415508Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0416269Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85012 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.664s) 2022-11-23T01:46:56.0416340Z 2022-11-23T01:46:56.0416611Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0416723Z Ran 1 test in 1.664s 2022-11-23T01:46:56.0416743Z 2022-11-23T01:46:56.0416851Z OK (skipped=1) 2022-11-23T01:46:56.0416871Z 2022-11-23T01:46:56.0416995Z Generating XML reports... 2022-11-23T01:46:56.0417445Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013129.xml 2022-11-23T01:46:56.0417820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0417999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0418423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0418622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0418641Z 2022-11-23T01:46:56.0418752Z Running tests... 2022-11-23T01:46:56.0419014Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0419327Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0419650Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0420401Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85339 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.603s) 2022-11-23T01:46:56.0420424Z 2022-11-23T01:46:56.0420683Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0420794Z Ran 1 test in 1.604s 2022-11-23T01:46:56.0420814Z 2022-11-23T01:46:56.0420917Z OK (skipped=1) 2022-11-23T01:46:56.0420939Z 2022-11-23T01:46:56.0421045Z Generating XML reports... 2022-11-23T01:46:56.0421491Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013134.xml 2022-11-23T01:46:56.0421863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0422038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0422422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0422618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0422641Z 2022-11-23T01:46:56.0422749Z Running tests... 2022-11-23T01:46:56.0423013Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0423308Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0423579Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0423800Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8865 2022-11-23T01:46:56.0424018Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8866 2022-11-23T01:46:56.0424392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0424568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0424950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0425147Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0425518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0425778Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0426173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0426366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0426614Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0426862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0427266Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0427724Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0427959Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0428199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0428444Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp49us5due 2022-11-23T01:46:56.0428721Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp49us5due/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0429249Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkfhkb5ad 2022-11-23T01:46:56.0429538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkfhkb5ad/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0429818Z [1669167103.416772] [08317a7e7676:8866 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0430058Z [1669167103.430185] [08317a7e7676:8866 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0430301Z [1669167103.430185] [08317a7e7676:8866 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0430577Z [1669167103.411826] [08317a7e7676:8865 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0430804Z [1669167103.425608] [08317a7e7676:8865 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0431021Z [1669167103.425608] [08317a7e7676:8865 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0431259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0431504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:56.0431606Z ok (6.630s) 2022-11-23T01:46:56.0431627Z 2022-11-23T01:46:56.0431901Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0432013Z Ran 1 test in 6.630s 2022-11-23T01:46:56.0432036Z 2022-11-23T01:46:56.0432129Z OK 2022-11-23T01:46:56.0432149Z 2022-11-23T01:46:56.0432273Z Generating XML reports... 2022-11-23T01:46:56.0432729Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013138.xml 2022-11-23T01:46:56.0433090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0433268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0433654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0433851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0433871Z 2022-11-23T01:46:56.0433979Z Running tests... 2022-11-23T01:46:56.0434243Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0434633Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0434928Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0435131Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 8983 2022-11-23T01:46:56.0435350Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 8984 2022-11-23T01:46:56.0435728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0435906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0436290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0436544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0436921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0437096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0437476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0437648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0437899Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0438148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0438556Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0438961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0439196Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0439427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0439688Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq_5agunk 2022-11-23T01:46:56.0439959Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq_5agunk/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0440198Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3j6p0q2p 2022-11-23T01:46:56.0440467Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3j6p0q2p/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0440745Z [1669167112.610641] [08317a7e7676:8984 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0440984Z [1669167112.623848] [08317a7e7676:8984 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0441226Z [1669167112.623848] [08317a7e7676:8984 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0441500Z [1669167112.603752] [08317a7e7676:8983 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0441729Z [1669167112.617386] [08317a7e7676:8983 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0441967Z [1669167112.617386] [08317a7e7676:8983 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0442206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0442450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:56.0442671Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0442951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0443059Z ok (6.665s) 2022-11-23T01:46:56.0443078Z 2022-11-23T01:46:56.0443348Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0443458Z Ran 1 test in 6.665s 2022-11-23T01:46:56.0443477Z 2022-11-23T01:46:56.0443569Z OK 2022-11-23T01:46:56.0443588Z 2022-11-23T01:46:56.0443715Z Generating XML reports... 2022-11-23T01:46:56.0444166Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013147.xml 2022-11-23T01:46:56.0444526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0444756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0445143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0445340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0445360Z 2022-11-23T01:46:56.0445470Z Running tests... 2022-11-23T01:46:56.0445732Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0446047Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0446325Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0447081Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78641 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.603s) 2022-11-23T01:46:56.0447105Z 2022-11-23T01:46:56.0447369Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0447466Z Ran 1 test in 1.603s 2022-11-23T01:46:56.0447487Z 2022-11-23T01:46:56.0447595Z OK (skipped=1) 2022-11-23T01:46:56.0447614Z 2022-11-23T01:46:56.0447738Z Generating XML reports... 2022-11-23T01:46:56.0448183Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013156.xml 2022-11-23T01:46:56.0448557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0448735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0449118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0449317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0449337Z 2022-11-23T01:46:56.0449425Z Running tests... 2022-11-23T01:46:56.0449688Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0450008Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0450298Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0451041Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77261 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.662s) 2022-11-23T01:46:56.0451062Z 2022-11-23T01:46:56.0451321Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0451436Z Ran 1 test in 1.663s 2022-11-23T01:46:56.0451456Z 2022-11-23T01:46:56.0451563Z OK (skipped=1) 2022-11-23T01:46:56.0451581Z 2022-11-23T01:46:56.0451705Z Generating XML reports... 2022-11-23T01:46:56.0452201Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013200.xml 2022-11-23T01:46:56.0452566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0452745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0453130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0453324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0453343Z 2022-11-23T01:46:56.0453451Z Running tests... 2022-11-23T01:46:56.0453711Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0454077Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0454369Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0454575Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9169 2022-11-23T01:46:56.0454797Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9170 2022-11-23T01:46:56.0455169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0455345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0455729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0455922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0456294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0456468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0456849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0457025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0457272Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0457520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0457925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0458325Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0458560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0458797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0459061Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps6xzfs2q 2022-11-23T01:46:56.0459332Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps6xzfs2q/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0459570Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphli0m3gs 2022-11-23T01:46:56.0459842Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphli0m3gs/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0460123Z [1669167130.184409] [08317a7e7676:9170 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0460356Z [1669167130.197850] [08317a7e7676:9170 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0460600Z [1669167130.197850] [08317a7e7676:9170 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0460810Z 2022-11-23T01:46:56.0461130Z [1669167130.178473] [08317a7e7676:9169 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0461372Z [1669167130.192211] [08317a7e7676:9169 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0461607Z [1669167130.192211] [08317a7e7676:9169 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0461692Z ok (6.153s) 2022-11-23T01:46:56.0461729Z 2022-11-23T01:46:56.0461979Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0462094Z Ran 1 test in 6.153s 2022-11-23T01:46:56.0462159Z 2022-11-23T01:46:56.0462256Z OK 2022-11-23T01:46:56.0462274Z 2022-11-23T01:46:56.0462399Z Generating XML reports... 2022-11-23T01:46:56.0462849Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013204.xml 2022-11-23T01:46:56.0463229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0463410Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0463796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0463971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0463991Z 2022-11-23T01:46:56.0464100Z Running tests... 2022-11-23T01:46:56.0464362Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0464680Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0464994Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0465214Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9283 2022-11-23T01:46:56.0465436Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9284 2022-11-23T01:46:56.0465810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0465969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0466352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0466541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0466912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0467090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0467475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0467671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0467920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0468171Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0468555Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0469170Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0469420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0469661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0469927Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx67gpjd0 2022-11-23T01:46:56.0470285Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx67gpjd0/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0470553Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmy2oeh3n 2022-11-23T01:46:56.0470825Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmy2oeh3n/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0471101Z [1669167138.846201] [08317a7e7676:9283 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0471318Z [1669167138.859978] [08317a7e7676:9283 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0471612Z [1669167138.859978] [08317a7e7676:9283 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0471888Z [1669167138.846178] [08317a7e7676:9284 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0472120Z [1669167138.859736] [08317a7e7676:9284 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0472356Z [1669167138.859736] [08317a7e7676:9284 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0472459Z ok (6.164s) 2022-11-23T01:46:56.0472480Z 2022-11-23T01:46:56.0472755Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0472869Z Ran 1 test in 6.165s 2022-11-23T01:46:56.0472889Z 2022-11-23T01:46:56.0472981Z OK 2022-11-23T01:46:56.0473000Z 2022-11-23T01:46:56.0473105Z Generating XML reports... 2022-11-23T01:46:56.0473560Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013213.xml 2022-11-23T01:46:56.0473934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0474114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0474497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0474691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0474711Z 2022-11-23T01:46:56.0474818Z Running tests... 2022-11-23T01:46:56.0475082Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0475395Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0475645Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0475873Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9397 2022-11-23T01:46:56.0476092Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9398 2022-11-23T01:46:56.0476470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0476649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0477031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0477223Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0477593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0477752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0478134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0478324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0478615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0478870Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0479279Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0479681Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0479916Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0480147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0480435Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3geqma8s 2022-11-23T01:46:56.0480707Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3geqma8s/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0480970Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaf69pf4g 2022-11-23T01:46:56.0481240Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaf69pf4g/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0481478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0481712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0481991Z [1669167147.582349] [08317a7e7676:9398 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0482221Z [1669167147.595661] [08317a7e7676:9398 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0482463Z [1669167147.595661] [08317a7e7676:9398 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0482722Z [1669167147.573393] [08317a7e7676:9397 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0482950Z [1669167147.587204] [08317a7e7676:9397 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0483190Z [1669167147.587204] [08317a7e7676:9397 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0483292Z ok (6.643s) 2022-11-23T01:46:56.0483312Z 2022-11-23T01:46:56.0483581Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0483694Z Ran 1 test in 6.643s 2022-11-23T01:46:56.0483713Z 2022-11-23T01:46:56.0483809Z OK 2022-11-23T01:46:56.0483828Z 2022-11-23T01:46:56.0483952Z Generating XML reports... 2022-11-23T01:46:56.0484402Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013222.xml 2022-11-23T01:46:56.0484761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0484942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0485324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0485517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0485537Z 2022-11-23T01:46:56.0485643Z Running tests... 2022-11-23T01:46:56.0485905Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0486219Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0486518Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0486739Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9515 2022-11-23T01:46:56.0486986Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9516 2022-11-23T01:46:56.0487371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0487552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0487936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0488126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0488491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0488711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0489090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0489266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0489519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0489768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0490172Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0490573Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0490807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0491036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0491297Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp98oaguy5 2022-11-23T01:46:56.0491572Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp98oaguy5/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0491811Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3c1utvsd 2022-11-23T01:46:56.0492083Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3c1utvsd/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0492360Z [1669167156.753743] [08317a7e7676:9515 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0492592Z [1669167156.767154] [08317a7e7676:9515 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0492832Z [1669167156.767154] [08317a7e7676:9515 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0493109Z [1669167156.760686] [08317a7e7676:9516 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0493341Z [1669167156.773893] [08317a7e7676:9516 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0493580Z [1669167156.773893] [08317a7e7676:9516 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0494375Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:46:56.0494481Z ok (6.653s) 2022-11-23T01:46:56.0494500Z 2022-11-23T01:46:56.0494769Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0494926Z Ran 1 test in 6.654s 2022-11-23T01:46:56.0494948Z 2022-11-23T01:46:56.0495023Z OK 2022-11-23T01:46:56.0495043Z 2022-11-23T01:46:56.0495167Z Generating XML reports... 2022-11-23T01:46:56.0495628Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013231.xml 2022-11-23T01:46:56.0496007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0496183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0496569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0496813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0496833Z 2022-11-23T01:46:56.0496942Z Running tests... 2022-11-23T01:46:56.0497191Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0497512Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0497798Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0498551Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78235 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.624s) 2022-11-23T01:46:56.0498572Z 2022-11-23T01:46:56.0498829Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0498943Z Ran 1 test in 1.624s 2022-11-23T01:46:56.0498963Z 2022-11-23T01:46:56.0499069Z OK (skipped=1) 2022-11-23T01:46:56.0499088Z 2022-11-23T01:46:56.0499213Z Generating XML reports... 2022-11-23T01:46:56.0499664Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013240.xml 2022-11-23T01:46:56.0500046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0500205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0500589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0500780Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0500800Z 2022-11-23T01:46:56.0500908Z Running tests... 2022-11-23T01:46:56.0501171Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0501489Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0501751Z test_ddp_create_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0501972Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9667 2022-11-23T01:46:56.0502177Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9668 2022-11-23T01:46:56.0502553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0502729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0503113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0503305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0503675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0503852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0504276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0504475Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0504708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0504956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0505360Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0505763Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0506047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0506307Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa_0uqs_v 2022-11-23T01:46:56.0506585Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa_0uqs_v/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0506819Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0507079Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu7hjtqem 2022-11-23T01:46:56.0507334Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu7hjtqem/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0507610Z [1669167168.687807] [08317a7e7676:9667 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0507902Z [1669167170.129675] [08317a7e7676:9667 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0508147Z [1669167170.129675] [08317a7e7676:9667 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0509259Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0509542Z [1669167168.689426] [08317a7e7676:9668 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0509771Z [1669167170.101334] [08317a7e7676:9668 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0510004Z [1669167170.101334] [08317a7e7676:9668 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0510907Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0512121Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 2022-11-23T01:46:56.0512360Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T01:46:56.0513609Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 
2022-11-23T01:46:56.0513856Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T01:46:56.0514100Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0514338Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0515314Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0516213Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0517102Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0517990Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0518875Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0519766Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0520649Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0521531Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0522461Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0523349Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T01:46:56.0523456Z ok (6.143s) 2022-11-23T01:46:56.0523476Z 2022-11-23T01:46:56.0523792Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0523888Z Ran 1 test in 6.143s 2022-11-23T01:46:56.0523908Z 2022-11-23T01:46:56.0524001Z OK 2022-11-23T01:46:56.0524020Z 2022-11-23T01:46:56.0524145Z Generating XML reports... 2022-11-23T01:46:56.0524605Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013244.xml 2022-11-23T01:46:56.0524989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0525166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0525555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0525748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0525768Z 2022-11-23T01:46:56.0525876Z Running tests... 2022-11-23T01:46:56.0526124Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0526444Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0526698Z test_ddp_device (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0527454Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77324 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.647s) 2022-11-23T01:46:56.0527475Z 2022-11-23T01:46:56.0527738Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0527849Z Ran 1 test in 1.647s 2022-11-23T01:46:56.0527869Z 2022-11-23T01:46:56.0527977Z OK (skipped=1) 2022-11-23T01:46:56.0527996Z 2022-11-23T01:46:56.0528121Z Generating XML reports... 2022-11-23T01:46:56.0528570Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013253.xml 2022-11-23T01:46:56.0528934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0529117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0529501Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0529774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0529795Z 2022-11-23T01:46:56.0529940Z Running tests... 2022-11-23T01:46:56.0530219Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0530686Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0531050Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0531315Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 9815 2022-11-23T01:46:56.0531517Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 9816 2022-11-23T01:46:56.0531987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0532218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0532631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0532840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0533260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0533534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0533958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0534249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0534485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0534771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0535218Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0535659Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0535928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0536239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0536576Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp739p7prt 2022-11-23T01:46:56.0536901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp739p7prt/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0537201Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpskiw999a 2022-11-23T01:46:56.0537453Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpskiw999a/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0538295Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T01:46:56.0538669Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T01:46:56.0539504Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T01:46:56.0539880Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T01:46:56.0540191Z [1669167183.091263] [08317a7e7676:9816 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0540510Z [1669167183.104708] [08317a7e7676:9816 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0540787Z [1669167183.104708] [08317a7e7676:9816 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0541100Z [1669167183.086373] [08317a7e7676:9815 :0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0541366Z [1669167183.100419] [08317a7e7676:9815 :0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0541717Z [1669167183.100419] [08317a7e7676:9815 :0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0541811Z ok (6.753s) 2022-11-23T01:46:56.0541885Z 2022-11-23T01:46:56.0542146Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0542347Z Ran 1 test in 6.753s 2022-11-23T01:46:56.0542368Z 2022-11-23T01:46:56.0542505Z OK 2022-11-23T01:46:56.0542525Z 2022-11-23T01:46:56.0542721Z Generating XML reports... 2022-11-23T01:46:56.0543217Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013257.xml 2022-11-23T01:46:56.0543627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0543893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0544315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0544497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0544517Z 2022-11-23T01:46:56.0544670Z Running tests... 
2022-11-23T01:46:56.0544973Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0545325Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0545676Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0546477Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78685 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.643s) 2022-11-23T01:46:56.0546503Z 2022-11-23T01:46:56.0546800Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0546951Z Ran 1 test in 1.644s 2022-11-23T01:46:56.0546971Z 2022-11-23T01:46:56.0547122Z OK (skipped=1) 2022-11-23T01:46:56.0547141Z 2022-11-23T01:46:56.0547347Z Generating XML reports... 2022-11-23T01:46:56.0547783Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013307.xml 2022-11-23T01:46:56.0548197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0548411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0548872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0549305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0549328Z 2022-11-23T01:46:56.0578669Z Running tests... 
2022-11-23T01:46:56.0579018Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0579345Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0579620Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0580378Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77293 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.631s) 2022-11-23T01:46:56.0580401Z 2022-11-23T01:46:56.0580655Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0580762Z Ran 1 test in 1.631s 2022-11-23T01:46:56.0580782Z 2022-11-23T01:46:56.0580878Z OK (skipped=1) 2022-11-23T01:46:56.0580898Z 2022-11-23T01:46:56.0581011Z Generating XML reports... 2022-11-23T01:46:56.0581582Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013311.xml 2022-11-23T01:46:56.0581978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0582147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0582518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0582702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0582722Z 2022-11-23T01:46:56.0582818Z Running tests... 
2022-11-23T01:46:56.0583069Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0583450Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0583736Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0583951Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10031 2022-11-23T01:46:56.0584164Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10032 2022-11-23T01:46:56.0584531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0584691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0585065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0585247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0585606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0585775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0586151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0586333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0586570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0586799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0587196Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0587589Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0587815Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0588043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0588260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0588486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0588872Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0589487Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0589731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5z_hs155 2022-11-23T01:46:56.0589993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5z_hs155/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0590244Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgl78_kje 2022-11-23T01:46:56.0590503Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgl78_kje/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0590862Z [1669167200.630110] [08317a7e7676:10032:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0591097Z [1669167200.643460] [08317a7e7676:10032:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0591324Z [1669167200.643460] [08317a7e7676:10032:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0591590Z [1669167200.620867] [08317a7e7676:10031:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0591807Z [1669167200.634318] [08317a7e7676:10031:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0592090Z [1669167200.634318] [08317a7e7676:10031:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0592316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0592545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0592768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0592990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0593080Z ok (6.812s) 2022-11-23T01:46:56.0593101Z 2022-11-23T01:46:56.0593365Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0593466Z Ran 1 test in 6.812s 2022-11-23T01:46:56.0593486Z 2022-11-23T01:46:56.0593565Z OK 2022-11-23T01:46:56.0593588Z 2022-11-23T01:46:56.0593700Z Generating XML reports... 
2022-11-23T01:46:56.0594135Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013315.xml 2022-11-23T01:46:56.0594506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0594673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0595047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0595230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0595250Z 2022-11-23T01:46:56.0595346Z Running tests... 2022-11-23T01:46:56.0595601Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0595905Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0596176Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0596385Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10149 2022-11-23T01:46:56.0596599Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10150 2022-11-23T01:46:56.0596966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0597131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0597502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0597683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0598039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0598203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0598575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0598758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0599045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0599290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0599685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0600074Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0600295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0600564Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:46:56.0600830Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0601090Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:46:56.0601341Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpryx0yxln 2022-11-23T01:46:56.0601603Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpryx0yxln/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0601847Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp961_ug6i 2022-11-23T01:46:56.0602104Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp961_ug6i/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0602369Z [1669167209.966516] [08317a7e7676:10150:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0602593Z [1669167209.980563] [08317a7e7676:10150:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0602823Z [1669167209.980563] [08317a7e7676:10150:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0603085Z [1669167209.959245] [08317a7e7676:10149:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0603299Z [1669167209.972230] [08317a7e7676:10149:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0603524Z [1669167209.972230] [08317a7e7676:10149:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0603753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0603976Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0604201Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0604424Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0604696Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T01:46:56.0604968Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T01:46:56.0605234Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:46:56.0605495Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T01:46:56.0605721Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0605948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0606171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0606393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0606704Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 
2022-11-23T01:46:56.0606976Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T01:46:56.0607244Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T01:46:56.0607503Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T01:46:56.0607719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0607985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0608210Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0608434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0608531Z ok (7.218s) 2022-11-23T01:46:56.0608552Z 2022-11-23T01:46:56.0608826Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0608940Z Ran 1 test in 7.218s 2022-11-23T01:46:56.0608960Z 2022-11-23T01:46:56.0609052Z OK 2022-11-23T01:46:56.0609072Z 2022-11-23T01:46:56.0609178Z Generating XML reports... 2022-11-23T01:46:56.0609633Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013324.xml 2022-11-23T01:46:56.0610011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0610194Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0610583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0610779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0610799Z 2022-11-23T01:46:56.0610909Z Running tests... 
2022-11-23T01:46:56.0611221Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0611540Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0611796Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0612550Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77378 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.629s) 2022-11-23T01:46:56.0612576Z 2022-11-23T01:46:56.0612837Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0612950Z Ran 1 test in 1.629s 2022-11-23T01:46:56.0612969Z 2022-11-23T01:46:56.0613081Z OK (skipped=1) 2022-11-23T01:46:56.0613100Z 2022-11-23T01:46:56.0613225Z Generating XML reports... 2022-11-23T01:46:56.0613678Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013334.xml 2022-11-23T01:46:56.0614056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0614236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0614621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0614798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0614822Z 2022-11-23T01:46:56.0614931Z Running tests... 
2022-11-23T01:46:56.0615194Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0615558Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0615841Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0616066Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10301 2022-11-23T01:46:56.0616286Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10302 2022-11-23T01:46:56.0616662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0616821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0617207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0617451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0617827Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0618006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0618386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0618578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0618830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0619078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0619470Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0619880Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0620117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0620677Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T01:46:56.0620908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0621461Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T01:46:56.0621726Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg9w3i64e 2022-11-23T01:46:56.0622001Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg9w3i64e/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0622261Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprq0rqpyc 2022-11-23T01:46:56.0622533Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprq0rqpyc/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0622773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0622995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:56.0623276Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T01:46:56.0623557Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T01:46:56.0623860Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T01:46:56.0624247Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T01:46:56.0624557Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T01:46:56.0624886Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T01:46:56.0625219Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T01:46:56.0625548Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T01:46:56.0625873Z [1669167223.896100] [08317a7e7676:10301:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0626090Z [1669167223.909891] [08317a7e7676:10301:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0626328Z [1669167223.909891] [08317a7e7676:10301:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0626603Z [1669167223.899562] [08317a7e7676:10302:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0626833Z [1669167223.912927] [08317a7e7676:10302:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0627073Z [1669167223.912927] [08317a7e7676:10302:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0627316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0627559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0627664Z ok (6.676s) 2022-11-23T01:46:56.0627685Z 2022-11-23T01:46:56.0627958Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0628053Z Ran 1 test in 6.677s 2022-11-23T01:46:56.0628091Z 2022-11-23T01:46:56.0628164Z OK 2022-11-23T01:46:56.0628183Z 2022-11-23T01:46:56.0628308Z Generating XML reports... 2022-11-23T01:46:56.0628763Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013338.xml 2022-11-23T01:46:56.0629369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0629558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0629945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0630144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0630165Z 2022-11-23T01:46:56.0630274Z Running tests... 2022-11-23T01:46:56.0630525Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0630841Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0631243Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... 
skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0631263Z 2022-11-23T01:46:56.0631527Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0631643Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0631663Z 2022-11-23T01:46:56.0631771Z OK (skipped=1) 2022-11-23T01:46:56.0631790Z 2022-11-23T01:46:56.0631915Z Generating XML reports... 2022-11-23T01:46:56.0632445Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013347.xml 2022-11-23T01:46:56.0632836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0632996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0633385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0633580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0633600Z 2022-11-23T01:46:56.0633710Z Running tests... 2022-11-23T01:46:56.0633971Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0634347Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0634747Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0634768Z 2022-11-23T01:46:56.0635029Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0635142Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0635161Z 2022-11-23T01:46:56.0635250Z OK (skipped=1) 2022-11-23T01:46:56.0635269Z 2022-11-23T01:46:56.0635396Z Generating XML reports... 
2022-11-23T01:46:56.0635847Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013350.xml 2022-11-23T01:46:56.0636222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0636405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0636792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0636986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0637009Z 2022-11-23T01:46:56.0637121Z Running tests... 2022-11-23T01:46:56.0637383Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0637678Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0638134Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0638155Z 2022-11-23T01:46:56.0638413Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0638531Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0638550Z 2022-11-23T01:46:56.0638659Z OK (skipped=1) 2022-11-23T01:46:56.0638678Z 2022-11-23T01:46:56.0638802Z Generating XML reports... 
2022-11-23T01:46:56.0639249Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013352.xml 2022-11-23T01:46:56.0639620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0639799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0640167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0640362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0640382Z 2022-11-23T01:46:56.0640490Z Running tests... 2022-11-23T01:46:56.0640756Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0641075Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0641579Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0641602Z 2022-11-23T01:46:56.0641867Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0641981Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0642000Z 2022-11-23T01:46:56.0642108Z OK (skipped=1) 2022-11-23T01:46:56.0642127Z 2022-11-23T01:46:56.0642232Z Generating XML reports... 
2022-11-23T01:46:56.0642681Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013355.xml 2022-11-23T01:46:56.0643055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0643280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0643665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0643862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0643883Z 2022-11-23T01:46:56.0643993Z Running tests... 2022-11-23T01:46:56.0644255Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0644567Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0645002Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0645041Z 2022-11-23T01:46:56.0645280Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0645396Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0645415Z 2022-11-23T01:46:56.0645525Z OK (skipped=1) 2022-11-23T01:46:56.0645544Z 2022-11-23T01:46:56.0645668Z Generating XML reports... 
2022-11-23T01:46:56.0646120Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013357.xml 2022-11-23T01:46:56.0646494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0646673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0647058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0647235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0647275Z 2022-11-23T01:46:56.0647365Z Running tests... 2022-11-23T01:46:56.0647628Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0647946Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0648401Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0648422Z 2022-11-23T01:46:56.0648684Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0648796Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0648816Z 2022-11-23T01:46:56.0648923Z OK (skipped=1) 2022-11-23T01:46:56.0648943Z 2022-11-23T01:46:56.0649067Z Generating XML reports... 
2022-11-23T01:46:56.0649519Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013359.xml 2022-11-23T01:46:56.0649877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0650060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0650446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0650702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0650724Z 2022-11-23T01:46:56.0650834Z Running tests... 2022-11-23T01:46:56.0651097Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0651411Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0651866Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0651886Z 2022-11-23T01:46:56.0652211Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0652304Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0652324Z 2022-11-23T01:46:56.0652432Z OK (skipped=1) 2022-11-23T01:46:56.0652451Z 2022-11-23T01:46:56.0652574Z Generating XML reports... 
2022-11-23T01:46:56.0653023Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013402.xml 2022-11-23T01:46:56.0653401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0653580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0653963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0654159Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0654179Z 2022-11-23T01:46:56.0654287Z Running tests... 2022-11-23T01:46:56.0654537Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0654852Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0655307Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0655328Z 2022-11-23T01:46:56.0655594Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0655706Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0655726Z 2022-11-23T01:46:56.0655833Z OK (skipped=1) 2022-11-23T01:46:56.0655852Z 2022-11-23T01:46:56.0655976Z Generating XML reports... 
2022-11-23T01:46:56.0656425Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013404.xml 2022-11-23T01:46:56.0656800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0656964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0657355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0657550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0657570Z 2022-11-23T01:46:56.0657678Z Running tests... 2022-11-23T01:46:56.0657942Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0658257Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0658706Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0658730Z 2022-11-23T01:46:56.0658995Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0659107Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0659126Z 2022-11-23T01:46:56.0659216Z OK (skipped=1) 2022-11-23T01:46:56.0659257Z 2022-11-23T01:46:56.0659411Z Generating XML reports... 
2022-11-23T01:46:56.0659874Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013407.xml 2022-11-23T01:46:56.0660250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0660428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0660812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0661006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0661025Z 2022-11-23T01:46:56.0661178Z Running tests... 2022-11-23T01:46:56.0661445Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0661738Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0662190Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0662211Z 2022-11-23T01:46:56.0662472Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0662584Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0662603Z 2022-11-23T01:46:56.0662710Z OK (skipped=1) 2022-11-23T01:46:56.0662729Z 2022-11-23T01:46:56.0662853Z Generating XML reports... 
2022-11-23T01:46:56.0663304Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013409.xml 2022-11-23T01:46:56.0663682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0663861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0664232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0664428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0664447Z 2022-11-23T01:46:56.0664556Z Running tests... 2022-11-23T01:46:56.0664816Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0665130Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0665525Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0665548Z 2022-11-23T01:46:56.0665802Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0665915Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0665935Z 2022-11-23T01:46:56.0666043Z OK (skipped=1) 2022-11-23T01:46:56.0666062Z 2022-11-23T01:46:56.0666170Z Generating XML reports... 
2022-11-23T01:46:56.0666617Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013412.xml 2022-11-23T01:46:56.0666991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0667169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0667552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0667746Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0667765Z 2022-11-23T01:46:56.0667876Z Running tests... 2022-11-23T01:46:56.0668137Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0668453Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0668870Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T01:46:56.0668910Z 2022-11-23T01:46:56.0669390Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0669504Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0669524Z 2022-11-23T01:46:56.0669634Z OK (skipped=1) 2022-11-23T01:46:56.0669654Z 2022-11-23T01:46:56.0669778Z Generating XML reports... 
2022-11-23T01:46:56.0670230Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013414.xml 2022-11-23T01:46:56.0670605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0670912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0671301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0671479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0671518Z 2022-11-23T01:46:56.0671607Z Running tests... 2022-11-23T01:46:56.0671870Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0672185Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0672456Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0673206Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77325 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.689s) 2022-11-23T01:46:56.0673230Z 2022-11-23T01:46:56.0673492Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0673609Z Ran 1 test in 1.689s 2022-11-23T01:46:56.0673629Z 2022-11-23T01:46:56.0673737Z OK (skipped=1) 2022-11-23T01:46:56.0673757Z 2022-11-23T01:46:56.0673882Z Generating XML reports... 
2022-11-23T01:46:56.0674312Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013417.xml 2022-11-23T01:46:56.0674689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0674866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0675254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0675453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0675473Z 2022-11-23T01:46:56.0675586Z Running tests... 2022-11-23T01:46:56.0675847Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0676165Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0676405Z test_ddp_inference (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0676629Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10849 2022-11-23T01:46:56.0676854Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10850 2022-11-23T01:46:56.0677227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0677404Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0677794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0677988Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0678425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0678608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0678975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0679165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0679417Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0679666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0680074Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0680527Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0680767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0681001Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0681264Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe2gvqkcb 2022-11-23T01:46:56.0681524Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe2gvqkcb/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0681782Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9xzc6mgj 2022-11-23T01:46:56.0682057Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9xzc6mgj/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0682340Z [1669167266.655196] [08317a7e7676:10849:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0682575Z [1669167266.668861] [08317a7e7676:10849:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0682815Z [1669167266.668861] [08317a7e7676:10849:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0683090Z [1669167266.664321] [08317a7e7676:10850:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0683320Z [1669167266.677673] [08317a7e7676:10850:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0683559Z [1669167266.677673] [08317a7e7676:10850:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0683647Z ok (6.977s) 2022-11-23T01:46:56.0683685Z 2022-11-23T01:46:56.0683940Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0684052Z Ran 1 test in 6.977s 2022-11-23T01:46:56.0684072Z 2022-11-23T01:46:56.0684165Z OK 2022-11-23T01:46:56.0684184Z 2022-11-23T01:46:56.0684312Z Generating XML reports... 2022-11-23T01:46:56.0684763Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013421.xml 2022-11-23T01:46:56.0685142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0685322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0685705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0685883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0685906Z 2022-11-23T01:46:56.0686016Z Running tests... 2022-11-23T01:46:56.0686281Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0686598Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0686921Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0687148Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 10963 2022-11-23T01:46:56.0687370Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 10964 2022-11-23T01:46:56.0687744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0687905Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0688292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0688531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0688909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0689089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0689472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0689664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0689913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0690163Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0690553Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0690962Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0691197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0691431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0691695Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkgjlb911 2022-11-23T01:46:56.0691968Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkgjlb911/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0692226Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnw3znk0y 2022-11-23T01:46:56.0692502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnw3znk0y/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0692745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0692966Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0693248Z [1669167276.539724] [08317a7e7676:10963:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0693484Z [1669167276.552951] [08317a7e7676:10963:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0693725Z [1669167276.552951] [08317a7e7676:10963:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0693998Z [1669167276.545107] [08317a7e7676:10964:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0694229Z [1669167276.558002] [08317a7e7676:10964:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0694466Z [1669167276.558002] [08317a7e7676:10964:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0694883Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T01:46:56.0695052Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T01:46:56.0695483Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T01:46:56.0695655Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T01:46:56.0695756Z ok (6.625s) 2022-11-23T01:46:56.0695776Z 2022-11-23T01:46:56.0696050Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0696166Z Ran 1 test in 6.625s 2022-11-23T01:46:56.0696186Z 2022-11-23T01:46:56.0696279Z OK 2022-11-23T01:46:56.0696299Z 2022-11-23T01:46:56.0696424Z Generating XML reports... 2022-11-23T01:46:56.0696875Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013430.xml 2022-11-23T01:46:56.0697297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0697457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0697850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0698045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0698065Z 2022-11-23T01:46:56.0698174Z Running tests... 
2022-11-23T01:46:56.0698439Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0698757Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0699029Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0699252Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11081 2022-11-23T01:46:56.0699459Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11082 2022-11-23T01:46:56.0699837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0700017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0700403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0700597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0700969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0701145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0701527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0701724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0701954Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0702205Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0702615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0703021Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0703256Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0703516Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcl0ijccv 2022-11-23T01:46:56.0703790Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcl0ijccv/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0704027Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0704286Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8nduz7ws 2022-11-23T01:46:56.0704587Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8nduz7ws/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0704833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0705070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0705347Z [1669167283.938153] [08317a7e7676:11082:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0705580Z [1669167285.342695] [08317a7e7676:11082:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0705821Z [1669167285.342695] [08317a7e7676:11082:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0706141Z [1669167283.916927] [08317a7e7676:11081:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0706376Z [1669167285.327848] [08317a7e7676:11081:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0706616Z [1669167285.327848] [08317a7e7676:11081:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0706700Z ok (6.269s) 2022-11-23T01:46:56.0706737Z 2022-11-23T01:46:56.0706993Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0707104Z Ran 1 test in 6.269s 2022-11-23T01:46:56.0707124Z 2022-11-23T01:46:56.0707218Z OK 2022-11-23T01:46:56.0707237Z 2022-11-23T01:46:56.0707363Z Generating XML reports... 2022-11-23T01:46:56.0707817Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013439.xml 2022-11-23T01:46:56.0708200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0708384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0708772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0709145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0709168Z 2022-11-23T01:46:56.0709283Z Running tests... 2022-11-23T01:46:56.0709550Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0709868Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0710137Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0710366Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11225 2022-11-23T01:46:56.0710588Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11226 2022-11-23T01:46:56.0710967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0711157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0711543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0711734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0712111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0712289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0712674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0712870Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0713121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0713459Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0713861Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0714259Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0714494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0714725Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0714989Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpng0tklsm 2022-11-23T01:46:56.0715329Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpng0tklsm/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0715587Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8m8t6uow 2022-11-23T01:46:56.0715860Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8m8t6uow/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0716139Z [1669167294.178836] [08317a7e7676:11226:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0716353Z [1669167294.192177] [08317a7e7676:11226:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0716593Z [1669167294.192177] [08317a7e7676:11226:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0716868Z [1669167294.171639] [08317a7e7676:11225:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0717103Z [1669167294.185579] [08317a7e7676:11225:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0717340Z [1669167294.185579] [08317a7e7676:11225:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0717584Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0717825Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:56.0717929Z ok (6.662s) 2022-11-23T01:46:56.0717949Z 2022-11-23T01:46:56.0718222Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0718318Z Ran 1 test in 6.663s 2022-11-23T01:46:56.0718359Z 2022-11-23T01:46:56.0718433Z OK 2022-11-23T01:46:56.0718451Z 2022-11-23T01:46:56.0718581Z Generating XML reports... 2022-11-23T01:46:56.0719037Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013448.xml 2022-11-23T01:46:56.0719419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0719600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0719986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0720180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0720200Z 2022-11-23T01:46:56.0720310Z Running tests... 2022-11-23T01:46:56.0720557Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0720871Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0721162Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0721389Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11343 2022-11-23T01:46:56.0721611Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11344 2022-11-23T01:46:56.0722034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0722215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0722599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0722774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0723145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0723320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0723750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0723943Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0724196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0724445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0724852Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0725253Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0725468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0725699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0725946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0726190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0726593Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0726990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0727237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:46:56.0727480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:46:56.0727873Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.0728251Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:46:56.0728514Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5ca22x5a 2022-11-23T01:46:56.0728790Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5ca22x5a/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0729049Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbnw2hsgr 2022-11-23T01:46:56.0729324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbnw2hsgr/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0729600Z [1669167303.370721] [08317a7e7676:11343:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0729833Z [1669167303.384335] [08317a7e7676:11343:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0730077Z [1669167303.384335] [08317a7e7676:11343:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0730411Z [1669167303.373499] [08317a7e7676:11344:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0730647Z [1669167303.386813] [08317a7e7676:11344:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0730865Z [1669167303.386813] [08317a7e7676:11344:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0730969Z ok (6.256s) 2022-11-23T01:46:56.0730989Z 2022-11-23T01:46:56.0731259Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0731373Z Ran 1 test in 6.256s 2022-11-23T01:46:56.0731392Z 2022-11-23T01:46:56.0731484Z OK 2022-11-23T01:46:56.0731503Z 2022-11-23T01:46:56.0731628Z Generating XML reports... 
2022-11-23T01:46:56.0732135Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013458.xml 2022-11-23T01:46:56.0732513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0732697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0733067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0733262Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0733282Z 2022-11-23T01:46:56.0733391Z Running tests... 2022-11-23T01:46:56.0733658Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0733977Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0734265Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0734492Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11463 2022-11-23T01:46:56.0734712Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11464 2022-11-23T01:46:56.0735070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0735248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0735634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0735826Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0736201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0736378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0736768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0736963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0737215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0737445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0737852Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0738255Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0738489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0738722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0738971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0739215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0739661Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0740066Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0740290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:46:56.0740530Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:46:56.0740924Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.0741317Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:46:56.0741628Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpth0gyymf 2022-11-23T01:46:56.0741908Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpth0gyymf/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0742167Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp18pdwhej 2022-11-23T01:46:56.0742438Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp18pdwhej/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0742714Z [1669167312.279210] [08317a7e7676:11464:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0742928Z [1669167312.292713] [08317a7e7676:11464:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0743167Z [1669167312.292713] [08317a7e7676:11464:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0743563Z [1669167322.663863] [08317a7e7676:11464:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x564a7604ab40 was not matched 2022-11-23T01:46:56.0743842Z [1669167312.278091] [08317a7e7676:11463:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0744074Z [1669167312.292122] [08317a7e7676:11463:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0744309Z [1669167312.292122] [08317a7e7676:11463:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0744625Z [1669167322.632693] [08317a7e7676:11463:1] ucc_schedule.h:189 UCC WARN timeout 10 sec. 
has expired on req 0x55cd66165480, seq_num 3, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T01:46:56.0744904Z [1669167322.663800] [08317a7e7676:11463:0] mpool.c:55 UCX WARN object 0x55cd66276a00 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T01:46:56.0745010Z ok (16.384s) 2022-11-23T01:46:56.0745031Z 2022-11-23T01:46:56.0745304Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0745400Z Ran 1 test in 16.384s 2022-11-23T01:46:56.0745439Z 2022-11-23T01:46:56.0745513Z OK 2022-11-23T01:46:56.0745532Z 2022-11-23T01:46:56.0745657Z Generating XML reports... 2022-11-23T01:46:56.0746110Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013506.xml 2022-11-23T01:46:56.0746486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0746666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0747052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0747251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0747271Z 2022-11-23T01:46:56.0747380Z Running tests... 2022-11-23T01:46:56.0747674Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0748000Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0748312Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0748539Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11583 2022-11-23T01:46:56.0748762Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11584 2022-11-23T01:46:56.0749428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0749688Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0750083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0750280Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0750637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0750813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0751198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0751393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0751643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0751891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0752301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0752707Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0752925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0753156Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0753418Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc92yesvc 2022-11-23T01:46:56.0753690Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc92yesvc/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0753950Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5fcp8a1q 2022-11-23T01:46:56.0754220Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5fcp8a1q/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0754501Z [1669167331.114184] [08317a7e7676:11583:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0754737Z [1669167331.128000] [08317a7e7676:11583:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0754977Z [1669167331.128000] [08317a7e7676:11583:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0755249Z [1669167331.122928] [08317a7e7676:11584:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0755462Z [1669167331.136300] [08317a7e7676:11584:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0755700Z [1669167331.136300] [08317a7e7676:11584:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0755806Z ok (7.069s) 2022-11-23T01:46:56.0755826Z 2022-11-23T01:46:56.0756099Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0756212Z Ran 1 test in 7.069s 2022-11-23T01:46:56.0756294Z 2022-11-23T01:46:56.0756392Z OK 2022-11-23T01:46:56.0756412Z 2022-11-23T01:46:56.0756537Z Generating XML reports... 2022-11-23T01:46:56.0756995Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013525.xml 2022-11-23T01:46:56.0757375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0757535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0757920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0758162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0758182Z 2022-11-23T01:46:56.0758291Z Running tests... 2022-11-23T01:46:56.0758561Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0758880Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0759177Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0759401Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11701 2022-11-23T01:46:56.0759605Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11702 2022-11-23T01:46:56.0759987Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0760164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0760547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0760744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0761118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0761296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0761677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0761872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0762101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0762349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0762753Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0763158Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0763396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0763630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0763893Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu55gp597 2022-11-23T01:46:56.0764164Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu55gp597/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0764422Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps5cczpky 2022-11-23T01:46:56.0764675Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps5cczpky/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0764951Z [1669167340.809262] [08317a7e7676:11702:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0765189Z [1669167340.822736] [08317a7e7676:11702:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0765475Z [1669167340.822736] [08317a7e7676:11702:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0765755Z [1669167340.803742] [08317a7e7676:11701:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0765985Z [1669167340.817484] [08317a7e7676:11701:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0766224Z [1669167340.817484] [08317a7e7676:11701:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0766327Z ok (7.068s) 2022-11-23T01:46:56.0766391Z 2022-11-23T01:46:56.0766665Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0766759Z Ran 1 test in 7.068s 2022-11-23T01:46:56.0766797Z 2022-11-23T01:46:56.0766872Z OK 2022-11-23T01:46:56.0766891Z 2022-11-23T01:46:56.0767015Z Generating XML reports... 2022-11-23T01:46:56.0767467Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013535.xml 2022-11-23T01:46:56.0767847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0768029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0768418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0768614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0768634Z 2022-11-23T01:46:56.0768747Z Running tests... 2022-11-23T01:46:56.0768992Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0769310Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0769578Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0769802Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11819 2022-11-23T01:46:56.0770024Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11820 2022-11-23T01:46:56.0770401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0770578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0770963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0771138Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0771516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0771693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0772080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0772273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0772525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0772773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0773177Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0773582Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0773801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0774033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0774351Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk70b88fl 2022-11-23T01:46:56.0774629Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk70b88fl/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0774884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4soe2mtv 2022-11-23T01:46:56.0775158Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4soe2mtv/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0775437Z [1669167350.446657] [08317a7e7676:11819:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0775714Z [1669167350.460617] [08317a7e7676:11819:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0775952Z [1669167350.460617] [08317a7e7676:11819:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0776213Z [1669167350.449564] [08317a7e7676:11820:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0776444Z [1669167350.463076] [08317a7e7676:11820:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0776681Z [1669167350.463076] [08317a7e7676:11820:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0776784Z ok (6.654s) 2022-11-23T01:46:56.0776804Z 2022-11-23T01:46:56.0777073Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0777189Z Ran 1 test in 6.654s 2022-11-23T01:46:56.0777212Z 2022-11-23T01:46:56.0777305Z OK 2022-11-23T01:46:56.0777325Z 2022-11-23T01:46:56.0777450Z Generating XML reports... 2022-11-23T01:46:56.0777899Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013545.xml 2022-11-23T01:46:56.0778263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0778444Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0778829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0779023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0779044Z 2022-11-23T01:46:56.0779153Z Running tests... 2022-11-23T01:46:56.0779418Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0779734Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0780008Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0780233Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 11933 2022-11-23T01:46:56.0780440Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 11934 2022-11-23T01:46:56.0780821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0780999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0781386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0781580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0781959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0782139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0782521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0782742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0782997Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0783246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0783652Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0784055Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0784291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0784571Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0784834Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjuwu4gel 2022-11-23T01:46:56.0785112Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjuwu4gel/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0785350Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkyvqfru3 2022-11-23T01:46:56.0785621Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkyvqfru3/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0785898Z [1669167359.588271] [08317a7e7676:11934:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0786134Z [1669167359.601503] [08317a7e7676:11934:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0786378Z [1669167359.601503] [08317a7e7676:11934:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0787181Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:46:56.0787459Z [1669167359.582013] [08317a7e7676:11933:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0787690Z [1669167359.595761] [08317a7e7676:11933:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0787929Z [1669167359.595761] [08317a7e7676:11933:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0788727Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:46:56.0789181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0789431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0789535Z ok (6.611s) 2022-11-23T01:46:56.0789561Z 2022-11-23T01:46:56.0789837Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0789933Z Ran 1 test in 6.611s 2022-11-23T01:46:56.0789952Z 2022-11-23T01:46:56.0790045Z OK 2022-11-23T01:46:56.0790064Z 2022-11-23T01:46:56.0790261Z Generating XML reports... 
2022-11-23T01:46:56.0790730Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013554.xml 2022-11-23T01:46:56.0791108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0791288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0791674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0791868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0791888Z 2022-11-23T01:46:56.0792040Z Running tests... 2022-11-23T01:46:56.0792310Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0792628Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0792918Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0793675Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78338 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.654s) 2022-11-23T01:46:56.0793696Z 2022-11-23T01:46:56.0793961Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0794075Z Ran 1 test in 1.654s 2022-11-23T01:46:56.0794094Z 2022-11-23T01:46:56.0794202Z OK (skipped=1) 2022-11-23T01:46:56.0794222Z 2022-11-23T01:46:56.0794351Z Generating XML reports... 
2022-11-23T01:46:56.0794803Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013603.xml 2022-11-23T01:46:56.0795164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0795344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0795731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0795927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0795946Z 2022-11-23T01:46:56.0796055Z Running tests... 2022-11-23T01:46:56.0796322Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0796639Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0796926Z test_ddp_profiling_autograd_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0797688Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77342 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.649s) 2022-11-23T01:46:56.0797709Z 2022-11-23T01:46:56.0797952Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0798066Z Ran 1 test in 1.650s 2022-11-23T01:46:56.0798085Z 2022-11-23T01:46:56.0798193Z OK (skipped=1) 2022-11-23T01:46:56.0798212Z 2022-11-23T01:46:56.0798336Z Generating XML reports... 
2022-11-23T01:46:56.0798785Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013607.xml 2022-11-23T01:46:56.0799160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0799343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0799734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0799981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0800004Z 2022-11-23T01:46:56.0800096Z Running tests... 2022-11-23T01:46:56.0800356Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0800671Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0800951Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0801176Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12119 2022-11-23T01:46:56.0801398Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12120 2022-11-23T01:46:56.0801822Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0801999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0802369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0802564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0802935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0803113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0803497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0803690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0803947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0804197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0804609Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0804995Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0805230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0805460Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0805724Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdcupt1w1 2022-11-23T01:46:56.0806001Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdcupt1w1/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0806264Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpysuj15x8 2022-11-23T01:46:56.0806537Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpysuj15x8/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0806818Z [1669167377.194950] [08317a7e7676:12120:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0807051Z [1669167377.208093] [08317a7e7676:12120:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0807272Z [1669167377.208093] [08317a7e7676:12120:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0807619Z STAGE:2022-11-23 01:36:17 12120:12120 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0807957Z [1669167377.186326] [08317a7e7676:12119:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0808195Z [1669167377.200198] [08317a7e7676:12119:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0808528Z [1669167377.200198] [08317a7e7676:12119:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0808879Z STAGE:2022-11-23 01:36:17 12119:12119 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0809120Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0809358Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0809701Z STAGE:2022-11-23 01:36:18 12120:12120 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0810036Z STAGE:2022-11-23 01:36:18 12119:12119 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0810425Z STAGE:2022-11-23 01:36:18 12119:12119 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0810774Z STAGE:2022-11-23 01:36:18 12120:12120 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0811609Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:46:56.0812407Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. 
This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:46:56.0812750Z STAGE:2022-11-23 01:36:18 12120:12120 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0813075Z STAGE:2022-11-23 01:36:18 12119:12119 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0813413Z STAGE:2022-11-23 01:36:18 12120:12120 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0813762Z STAGE:2022-11-23 01:36:18 12120:12120 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0814100Z STAGE:2022-11-23 01:36:18 12119:12119 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0814455Z STAGE:2022-11-23 01:36:18 12119:12119 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0814560Z ok (7.276s) 2022-11-23T01:46:56.0814580Z 2022-11-23T01:46:56.0814852Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0814951Z Ran 1 test in 7.277s 2022-11-23T01:46:56.0814971Z 2022-11-23T01:46:56.0815064Z OK 2022-11-23T01:46:56.0815083Z 2022-11-23T01:46:56.0815212Z Generating XML reports... 
2022-11-23T01:46:56.0815668Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013611.xml 2022-11-23T01:46:56.0816048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0816228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0816613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0816813Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0816833Z 2022-11-23T01:46:56.0816941Z Running tests... 2022-11-23T01:46:56.0817186Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0817549Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0817828Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0818053Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12241 2022-11-23T01:46:56.0818272Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12242 2022-11-23T01:46:56.0818650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0818828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0819263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0819438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0819814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0819991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0820378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0820573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0820825Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0821074Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0821481Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0821892Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0822111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0822343Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0822607Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf14u7hfq 2022-11-23T01:46:56.0822883Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf14u7hfq/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0823144Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpio7muyph 2022-11-23T01:46:56.0823418Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpio7muyph/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0823699Z [1669167386.984301] [08317a7e7676:12241:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0823937Z [1669167386.998163] [08317a7e7676:12241:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0824179Z [1669167386.998163] [08317a7e7676:12241:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0824433Z [1669167386.984795] [08317a7e7676:12242:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0824664Z [1669167386.998312] [08317a7e7676:12242:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0824902Z [1669167386.998312] [08317a7e7676:12242:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0825010Z ok (6.159s) 2022-11-23T01:46:56.0825030Z 2022-11-23T01:46:56.0825300Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0825414Z Ran 1 test in 6.160s 2022-11-23T01:46:56.0825433Z 2022-11-23T01:46:56.0825524Z OK 2022-11-23T01:46:56.0825543Z 2022-11-23T01:46:56.0825714Z Generating XML reports... 2022-11-23T01:46:56.0826175Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013621.xml 2022-11-23T01:46:56.0826539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0826720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0827105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0827302Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0827321Z 2022-11-23T01:46:56.0827476Z Running tests... 2022-11-23T01:46:56.0827744Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0828059Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0828346Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0829318Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78595 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.643s) 2022-11-23T01:46:56.0829342Z 2022-11-23T01:46:56.0829611Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0829707Z Ran 1 test in 1.643s 2022-11-23T01:46:56.0829727Z 2022-11-23T01:46:56.0829837Z OK (skipped=1) 2022-11-23T01:46:56.0829861Z 2022-11-23T01:46:56.0829984Z Generating XML reports... 2022-11-23T01:46:56.0830438Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013630.xml 2022-11-23T01:46:56.0830817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0830995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0831380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0831575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0831595Z 2022-11-23T01:46:56.0831703Z Running tests... 2022-11-23T01:46:56.0831948Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0832264Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0832554Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0832778Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12389 2022-11-23T01:46:56.0833001Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12390 2022-11-23T01:46:56.0833379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0833558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0833945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0834121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0834493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0834669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0835059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0835252Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0835587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0835845Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0836256Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0836657Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0836873Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0837106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0837430Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgt3bf7au 2022-11-23T01:46:56.0837704Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgt3bf7au/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0837965Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpomi31x60 2022-11-23T01:46:56.0838235Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpomi31x60/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0839163Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T01:46:56.0839281Z warnings.warn( 2022-11-23T01:46:56.0840211Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T01:46:56.0840326Z warnings.warn( 2022-11-23T01:46:56.0840551Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0840791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T01:46:56.0841069Z [1669167399.968519] [08317a7e7676:12389:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0841302Z [1669167399.981512] [08317a7e7676:12389:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0841547Z [1669167399.981512] [08317a7e7676:12389:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0841823Z [1669167399.972038] [08317a7e7676:12390:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0842055Z [1669167399.985079] [08317a7e7676:12390:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0842293Z [1669167399.985079] [08317a7e7676:12390:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0842397Z ok (6.663s) 2022-11-23T01:46:56.0842418Z 2022-11-23T01:46:56.0842690Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0842784Z Ran 1 test in 6.663s 2022-11-23T01:46:56.0842804Z 2022-11-23T01:46:56.0842897Z OK 2022-11-23T01:46:56.0842916Z 2022-11-23T01:46:56.0843045Z Generating XML reports... 
2022-11-23T01:46:56.0843501Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013634.xml 2022-11-23T01:46:56.0843926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0844108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0844498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0844696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0844715Z 2022-11-23T01:46:56.0844825Z Running tests... 2022-11-23T01:46:56.0845072Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0845388Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0845670Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0846482Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77625 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.655s) 2022-11-23T01:46:56.0846503Z 2022-11-23T01:46:56.0846767Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0846881Z Ran 1 test in 1.655s 2022-11-23T01:46:56.0846900Z 2022-11-23T01:46:56.0847010Z OK (skipped=1) 2022-11-23T01:46:56.0847029Z 2022-11-23T01:46:56.0847156Z Generating XML reports... 
2022-11-23T01:46:56.0847607Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013643.xml 2022-11-23T01:46:56.0847965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0848150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0848538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0848736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0848756Z 2022-11-23T01:46:56.0848866Z Running tests... 2022-11-23T01:46:56.0849126Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0849444Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0849722Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0849946Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12541 2022-11-23T01:46:56.0850150Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12542 2022-11-23T01:46:56.0850532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0850710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0851100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0851294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0851665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0851843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0852226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0852400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0852653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0852902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0853353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0853762Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0853997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0854230Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0854494Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4zbux672 2022-11-23T01:46:56.0854766Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4zbux672/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0855055Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpja8h_e1y 2022-11-23T01:46:56.0855323Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpja8h_e1y/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0855604Z [1669167413.342091] [08317a7e7676:12542:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0855841Z [1669167413.356380] [08317a7e7676:12542:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0856079Z [1669167413.356380] [08317a7e7676:12542:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0856424Z STAGE:2022-11-23 01:36:54 12542:12542 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0856700Z [1669167413.341935] [08317a7e7676:12541:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0856935Z [1669167413.356391] [08317a7e7676:12541:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0857176Z [1669167413.356391] [08317a7e7676:12541:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0857516Z STAGE:2022-11-23 01:36:54 12541:12541 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0857740Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0857981Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T01:46:56.0858324Z STAGE:2022-11-23 01:36:54 12542:12542 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0858662Z STAGE:2022-11-23 01:36:54 12541:12541 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0859015Z STAGE:2022-11-23 01:36:54 12541:12541 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0859366Z STAGE:2022-11-23 01:36:54 12542:12542 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0859701Z STAGE:2022-11-23 01:36:54 12541:12541 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.0860033Z STAGE:2022-11-23 01:36:54 12541:12541 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.0860390Z STAGE:2022-11-23 01:36:54 12541:12541 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.0860476Z ok (7.475s) 2022-11-23T01:46:56.0860496Z 2022-11-23T01:46:56.0860765Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0860884Z Ran 1 test in 7.475s 2022-11-23T01:46:56.0860903Z 2022-11-23T01:46:56.0860997Z OK 2022-11-23T01:46:56.0861016Z 2022-11-23T01:46:56.0861141Z Generating XML reports... 
2022-11-23T01:46:56.0861599Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013648.xml 2022-11-23T01:46:56.0861979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0862206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0862583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0862778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0862798Z 2022-11-23T01:46:56.0862907Z Running tests... 2022-11-23T01:46:56.0863175Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0863490Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0863762Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0864036Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12663 2022-11-23T01:46:56.0864258Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12664 2022-11-23T01:46:56.0864643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0864803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0865189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0865384Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0865757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0865934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0866325Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0866518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0866772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0867002Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0867411Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0867814Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0868050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0868278Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0868544Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa0f3h0bb 2022-11-23T01:46:56.0868817Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa0f3h0bb/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0869283Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphzmh6sra 2022-11-23T01:46:56.0869560Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphzmh6sra/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0869818Z [1669167423.386400] [08317a7e7676:12663:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0870057Z [1669167423.400268] [08317a7e7676:12663:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0870295Z [1669167423.400268] [08317a7e7676:12663:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0870576Z [1669167423.392440] [08317a7e7676:12664:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0870882Z [1669167423.406134] [08317a7e7676:12664:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0871129Z [1669167423.406134] [08317a7e7676:12664:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0871233Z ok (6.136s) 2022-11-23T01:46:56.0871254Z 2022-11-23T01:46:56.0871528Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0871644Z Ran 1 test in 6.137s 2022-11-23T01:46:56.0871663Z 2022-11-23T01:46:56.0871737Z OK 2022-11-23T01:46:56.0871776Z 2022-11-23T01:46:56.0871883Z Generating XML reports... 2022-11-23T01:46:56.0872334Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013658.xml 2022-11-23T01:46:56.0872778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0872957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0873347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0873542Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0873561Z 2022-11-23T01:46:56.0873672Z Running tests... 2022-11-23T01:46:56.0873938Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0874237Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0874517Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0874742Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12777 2022-11-23T01:46:56.0874971Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12778 2022-11-23T01:46:56.0875346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0875528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0875916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0876113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0876488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0876648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0877032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0877230Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0877480Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0877729Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0878139Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0878545Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0878779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0878993Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0879255Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppxe_o4k0 2022-11-23T01:46:56.0879530Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppxe_o4k0/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0879787Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq5xtqoe1 2022-11-23T01:46:56.0880106Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq5xtqoe1/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0880387Z [1669167432.020691] [08317a7e7676:12777:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0880623Z [1669167432.033901] [08317a7e7676:12777:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0880863Z [1669167432.033901] [08317a7e7676:12777:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0881139Z [1669167432.028942] [08317a7e7676:12778:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0881423Z [1669167432.042255] [08317a7e7676:12778:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0881647Z [1669167432.042255] [08317a7e7676:12778:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0881750Z ok (6.070s) 2022-11-23T01:46:56.0881770Z 2022-11-23T01:46:56.0882038Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0882151Z Ran 1 test in 6.070s 2022-11-23T01:46:56.0882171Z 2022-11-23T01:46:56.0882265Z OK 2022-11-23T01:46:56.0882284Z 2022-11-23T01:46:56.0882410Z Generating XML reports... 2022-11-23T01:46:56.0882864Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013706.xml 2022-11-23T01:46:56.0883242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0883425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0883793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0883990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0884011Z 2022-11-23T01:46:56.0884121Z Running tests... 2022-11-23T01:46:56.0884386Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0884703Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0884982Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0885735Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78684 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.601s) 2022-11-23T01:46:56.0885759Z 2022-11-23T01:46:56.0886025Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0886138Z Ran 1 test in 1.601s 2022-11-23T01:46:56.0886157Z 2022-11-23T01:46:56.0886249Z OK (skipped=1) 2022-11-23T01:46:56.0886291Z 2022-11-23T01:46:56.0886397Z Generating XML reports... 2022-11-23T01:46:56.0886848Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013715.xml 2022-11-23T01:46:56.0887226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0887405Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0887793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0887986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0888009Z 2022-11-23T01:46:56.0888118Z Running tests... 2022-11-23T01:46:56.0888382Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0888727Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0888998Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0889750Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/75648 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.638s) 2022-11-23T01:46:56.0889772Z 2022-11-23T01:46:56.0890034Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0890147Z Ran 1 test in 1.639s 2022-11-23T01:46:56.0890209Z 2022-11-23T01:46:56.0890318Z OK (skipped=1) 2022-11-23T01:46:56.0890337Z 2022-11-23T01:46:56.0890462Z Generating XML reports... 2022-11-23T01:46:56.0890916Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013719.xml 2022-11-23T01:46:56.0891298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0891477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0891847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0892042Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0892062Z 2022-11-23T01:46:56.0892171Z Running tests... 2022-11-23T01:46:56.0892434Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0892750Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0893047Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0893796Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78113 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.648s) 2022-11-23T01:46:56.0893818Z 2022-11-23T01:46:56.0894083Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0894198Z Ran 1 test in 1.648s 2022-11-23T01:46:56.0894218Z 2022-11-23T01:46:56.0894306Z OK (skipped=1) 2022-11-23T01:46:56.0894344Z 2022-11-23T01:46:56.0894450Z Generating XML reports... 2022-11-23T01:46:56.0894899Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013723.xml 2022-11-23T01:46:56.0895280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0895459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0895847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0896045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0896064Z 2022-11-23T01:46:56.0896172Z Running tests... 2022-11-23T01:46:56.0896433Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0896730Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0897031Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0897255Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 12993 2022-11-23T01:46:56.0897482Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 12994 2022-11-23T01:46:56.0897857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0898097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0898493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0898688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0899043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0899221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0899602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0899844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0900093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0900344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0900753Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0901158Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0901393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0901609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0901871Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuayaygb8 2022-11-23T01:46:56.0902150Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuayaygb8/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0902407Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_heov29v 2022-11-23T01:46:56.0902678Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_heov29v/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0902955Z [1669167453.167001] [08317a7e7676:12994:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0903191Z [1669167453.180100] [08317a7e7676:12994:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0903428Z [1669167453.180100] [08317a7e7676:12994:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0903706Z [1669167453.162384] [08317a7e7676:12993:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0903939Z [1669167453.176343] [08317a7e7676:12993:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0904160Z [1669167453.176343] [08317a7e7676:12993:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0904264Z ok (6.671s) 2022-11-23T01:46:56.0904284Z 2022-11-23T01:46:56.0904555Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0904669Z Ran 1 test in 6.671s 2022-11-23T01:46:56.0904688Z 2022-11-23T01:46:56.0904779Z OK 2022-11-23T01:46:56.0904798Z 2022-11-23T01:46:56.0904922Z Generating XML reports... 2022-11-23T01:46:56.0905371Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013727.xml 2022-11-23T01:46:56.0905749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0905916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0906305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0906550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0906571Z 2022-11-23T01:46:56.0906681Z Running tests... 2022-11-23T01:46:56.0906945Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0907262Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0907537Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0907760Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13111 2022-11-23T01:46:56.0907982Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13112 2022-11-23T01:46:56.0908393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0908570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0909161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0909366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0909742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0909920Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0910329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0910545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0910778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0911031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0911479Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0911884Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0912120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0912506Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T01:46:56.0912761Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T01:46:56.0912995Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0913379Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T01:46:56.0913621Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T01:46:56.0913884Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwz7qp5cc 2022-11-23T01:46:56.0914161Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwz7qp5cc/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0914418Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphaytqtjy 2022-11-23T01:46:56.0914689Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphaytqtjy/_remote_module_non_scriptable.py 2022-11-23T01:46:56.0914963Z [1669167462.423098] [08317a7e7676:13112:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.0915198Z [1669167462.436569] [08317a7e7676:13112:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0915441Z [1669167462.436569] [08317a7e7676:13112:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0915795Z [1669167462.419353] [08317a7e7676:13111:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.0916032Z [1669167462.433192] [08317a7e7676:13111:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.0916253Z [1669167462.433192] [08317a7e7676:13111:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.0916356Z ok (6.179s) 2022-11-23T01:46:56.0916376Z 2022-11-23T01:46:56.0916650Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0916763Z Ran 1 test in 6.179s 2022-11-23T01:46:56.0916783Z 2022-11-23T01:46:56.0916875Z OK 2022-11-23T01:46:56.0916894Z 2022-11-23T01:46:56.0917077Z Generating XML reports... 2022-11-23T01:46:56.0917527Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013737.xml 2022-11-23T01:46:56.0917910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0918070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0918453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0918646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0918666Z 2022-11-23T01:46:56.0918775Z Running tests... 2022-11-23T01:46:56.0919039Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0919357Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0919624Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0919850Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13225 2022-11-23T01:46:56.0920072Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13226 2022-11-23T01:46:56.0920434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0920612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0920998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0921191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0921567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0921744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0922136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0922329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0922561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0922810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0923216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0923617Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0923853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0924097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0924328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0924567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0925015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0925418Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0925503Z ok (4.448s) 2022-11-23T01:46:56.0925523Z 2022-11-23T01:46:56.0925787Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0925901Z Ran 1 test in 4.448s 2022-11-23T01:46:56.0925921Z 2022-11-23T01:46:56.0926013Z OK 2022-11-23T01:46:56.0926032Z 2022-11-23T01:46:56.0926158Z Generating XML reports... 2022-11-23T01:46:56.0926609Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013745.xml 2022-11-23T01:46:56.0927037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0927219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0927587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0927784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0927804Z 2022-11-23T01:46:56.0927913Z Running tests... 
2022-11-23T01:46:56.0928177Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0928495Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0928755Z test_destroy_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0928982Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13328 2022-11-23T01:46:56.0929202Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13329 2022-11-23T01:46:56.0929563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0929742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0930128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0930322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0930693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0930871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0931257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0931455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0931703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0931934Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0932339Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0932742Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0932976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0933219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0933447Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0933688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0934133Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0934538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0934623Z ok (4.408s) 2022-11-23T01:46:56.0934643Z 2022-11-23T01:46:56.0934910Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0935026Z Ran 1 test in 4.408s 2022-11-23T01:46:56.0935046Z 2022-11-23T01:46:56.0935139Z OK 2022-11-23T01:46:56.0935158Z 2022-11-23T01:46:56.0935283Z Generating XML reports... 
2022-11-23T01:46:56.0935735Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013752.xml 2022-11-23T01:46:56.0936162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0936339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0936725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0936901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0936921Z 2022-11-23T01:46:56.0937031Z Running tests... 2022-11-23T01:46:56.0937298Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0937615Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0937896Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0938650Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78767 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.671s) 2022-11-23T01:46:56.0938674Z 2022-11-23T01:46:56.0938945Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0939059Z Ran 1 test in 1.671s 2022-11-23T01:46:56.0939078Z 2022-11-23T01:46:56.0939187Z OK (skipped=1) 2022-11-23T01:46:56.0939206Z 2022-11-23T01:46:56.0939311Z Generating XML reports... 
2022-11-23T01:46:56.0939764Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013759.xml 2022-11-23T01:46:56.0940140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0940318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0940707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0940905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0940925Z 2022-11-23T01:46:56.0941035Z Running tests... 2022-11-23T01:46:56.0941305Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0941627Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0941903Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0942656Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78748 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.643s) 2022-11-23T01:46:56.0942677Z 2022-11-23T01:46:56.0942946Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0943059Z Ran 1 test in 1.644s 2022-11-23T01:46:56.0943079Z 2022-11-23T01:46:56.0943186Z OK (skipped=1) 2022-11-23T01:46:56.0943205Z 2022-11-23T01:46:56.0943330Z Generating XML reports... 
2022-11-23T01:46:56.0943823Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013804.xml 2022-11-23T01:46:56.0944205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0944383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0944767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0944945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0944965Z 2022-11-23T01:46:56.0945076Z Running tests... 2022-11-23T01:46:56.0945440Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0945755Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0946036Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0946260Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13499 2022-11-23T01:46:56.0946483Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13500 2022-11-23T01:46:56.0946861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0947019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0947407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0947602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0947982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0948159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0948544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0948738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0949211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0949471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0949861Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0950258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0950497Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0950730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0950836Z ok (4.213s) 2022-11-23T01:46:56.0950856Z 2022-11-23T01:46:56.0951123Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0951237Z Ran 1 test in 4.213s 2022-11-23T01:46:56.0951256Z 2022-11-23T01:46:56.0951351Z OK 2022-11-23T01:46:56.0951370Z 2022-11-23T01:46:56.0951475Z Generating XML reports... 2022-11-23T01:46:56.0951925Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013808.xml 2022-11-23T01:46:56.0952300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0952478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0952866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0953060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0953079Z 2022-11-23T01:46:56.0953259Z Running tests... 2022-11-23T01:46:56.0953539Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0953858Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0954101Z test_gather (__main__.TestDistBackendWithSpawn) ... 
skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.0954121Z 2022-11-23T01:46:56.0954384Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0954497Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0954517Z 2022-11-23T01:46:56.0954626Z OK (skipped=1) 2022-11-23T01:46:56.0954645Z 2022-11-23T01:46:56.0954827Z Generating XML reports... 2022-11-23T01:46:56.0955276Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013814.xml 2022-11-23T01:46:56.0955656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0955836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0956219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0956392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0956412Z 2022-11-23T01:46:56.0956521Z Running tests... 2022-11-23T01:46:56.0956786Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0957099Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0957371Z test_gather_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.0957394Z 2022-11-23T01:46:56.0957654Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0957766Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0957786Z 2022-11-23T01:46:56.0957900Z OK (skipped=1) 2022-11-23T01:46:56.0957919Z 2022-11-23T01:46:56.0958024Z Generating XML reports... 
2022-11-23T01:46:56.0958477Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013817.xml 2022-11-23T01:46:56.0958855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0959033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0959420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0959615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0959637Z 2022-11-23T01:46:56.0959745Z Running tests... 2022-11-23T01:46:56.0960009Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0960328Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0960569Z test_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T01:46:56.0960589Z 2022-11-23T01:46:56.0960854Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0960966Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0960985Z 2022-11-23T01:46:56.0961093Z OK (skipped=1) 2022-11-23T01:46:56.0961112Z 2022-11-23T01:46:56.0961235Z Generating XML reports... 
2022-11-23T01:46:56.0961682Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013819.xml 2022-11-23T01:46:56.0962057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0962239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0962679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0962860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0962880Z 2022-11-23T01:46:56.0962987Z Running tests... 2022-11-23T01:46:56.0963250Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0963564Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0963839Z test_gather_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.0963859Z 2022-11-23T01:46:56.0964118Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0964277Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0964297Z 2022-11-23T01:46:56.0964406Z OK (skipped=1) 2022-11-23T01:46:56.0964425Z 2022-11-23T01:46:56.0964530Z Generating XML reports... 
2022-11-23T01:46:56.0964984Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013822.xml 2022-11-23T01:46:56.0965356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0965532Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0965916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0966110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0966130Z 2022-11-23T01:46:56.0966238Z Running tests... 2022-11-23T01:46:56.0966501Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0966820Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0967070Z test_gather_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.0967109Z 2022-11-23T01:46:56.0967357Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0967468Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0967488Z 2022-11-23T01:46:56.0967593Z OK (skipped=1) 2022-11-23T01:46:56.0967612Z 2022-11-23T01:46:56.0967737Z Generating XML reports... 
2022-11-23T01:46:56.0968188Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013824.xml 2022-11-23T01:46:56.0968562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0968739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0969123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0969298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0969317Z 2022-11-23T01:46:56.0969425Z Running tests... 2022-11-23T01:46:56.0969689Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0970004Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0970271Z test_gather_object (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.0970291Z 2022-11-23T01:46:56.0970552Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0970663Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0970683Z 2022-11-23T01:46:56.0970791Z OK (skipped=1) 2022-11-23T01:46:56.0970810Z 2022-11-23T01:46:56.0970933Z Generating XML reports... 
2022-11-23T01:46:56.0971370Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013826.xml 2022-11-23T01:46:56.0971747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0971974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0972362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0972555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0972575Z 2022-11-23T01:46:56.0972684Z Running tests... 2022-11-23T01:46:56.0972949Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0973264Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0973525Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.0973609Z 2022-11-23T01:46:56.0973857Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0973967Z Ran 1 test in 0.002s 2022-11-23T01:46:56.0973986Z 2022-11-23T01:46:56.0974094Z OK (skipped=1) 2022-11-23T01:46:56.0974113Z 2022-11-23T01:46:56.0974240Z Generating XML reports... 
2022-11-23T01:46:56.0974689Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013829.xml 2022-11-23T01:46:56.0975064Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0975243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0975627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0975804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0975844Z 2022-11-23T01:46:56.0975935Z Running tests... 2022-11-23T01:46:56.0976199Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0976512Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0976768Z test_get_backend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0976991Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13833 2022-11-23T01:46:56.0977215Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13834 2022-11-23T01:46:56.0977588Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0977746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0978130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0978326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0978696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0978876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0979259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0979451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0979699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0979949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0980334Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0980742Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0980978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0981249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0981495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.0981740Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.0982143Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0982541Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.0982643Z ok (4.283s) 2022-11-23T01:46:56.0982663Z 2022-11-23T01:46:56.0982956Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0983068Z Ran 1 test in 4.283s 2022-11-23T01:46:56.0983087Z 2022-11-23T01:46:56.0983179Z OK 2022-11-23T01:46:56.0983199Z 2022-11-23T01:46:56.0983322Z Generating XML reports... 2022-11-23T01:46:56.0983778Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013831.xml 2022-11-23T01:46:56.0984158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0984335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0984721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0984913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0984933Z 2022-11-23T01:46:56.0985023Z Running tests... 
2022-11-23T01:46:56.0985295Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0985610Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0985893Z test_get_future (__main__.TestDistBackendWithSpawn) ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T01:46:56.0985913Z 2022-11-23T01:46:56.0986173Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0986285Z Ran 1 test in 0.003s 2022-11-23T01:46:56.0986305Z 2022-11-23T01:46:56.0986413Z OK (skipped=1) 2022-11-23T01:46:56.0986432Z 2022-11-23T01:46:56.0986557Z Generating XML reports... 2022-11-23T01:46:56.0986986Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013838.xml 2022-11-23T01:46:56.0987359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0987537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0987924Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0988116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0988139Z 2022-11-23T01:46:56.0988249Z Running tests... 2022-11-23T01:46:56.0988510Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0988827Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0989283Z test_get_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0989495Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 13969 2022-11-23T01:46:56.0989717Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 13970 2022-11-23T01:46:56.0990090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0990271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0990656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0990914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0991298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0991476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0991842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0992037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0992289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.0992593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.0993001Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.0993408Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.0993643Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.0993872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.0993975Z ok (4.369s) 2022-11-23T01:46:56.0993995Z 2022-11-23T01:46:56.0994241Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0994354Z Ran 1 test in 4.369s 2022-11-23T01:46:56.0994374Z 2022-11-23T01:46:56.0994466Z OK 2022-11-23T01:46:56.0994485Z 2022-11-23T01:46:56.0994609Z Generating XML reports... 2022-11-23T01:46:56.0995068Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013841.xml 2022-11-23T01:46:56.0995444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0995626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0996015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0996207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0996227Z 2022-11-23T01:46:56.0996317Z Running tests... 2022-11-23T01:46:56.0996582Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.0996895Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.0997165Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.0997391Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14072 2022-11-23T01:46:56.0997610Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14073 2022-11-23T01:46:56.0997994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0998175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0998539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0998731Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.0999104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.0999280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.0999668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.0999860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1000158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1000408Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1000813Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1001196Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1001431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1001675Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1001947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1002184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1002588Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1002985Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1003089Z ok (4.359s) 2022-11-23T01:46:56.1003109Z 2022-11-23T01:46:56.1003375Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1003471Z Ran 1 test in 4.359s 2022-11-23T01:46:56.1003490Z 2022-11-23T01:46:56.1003583Z OK 2022-11-23T01:46:56.1003602Z 2022-11-23T01:46:56.1003726Z Generating XML reports... 2022-11-23T01:46:56.1004179Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013847.xml 2022-11-23T01:46:56.1004559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1004740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1005128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1005323Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1005343Z 2022-11-23T01:46:56.1005433Z Running tests... 
2022-11-23T01:46:56.1005698Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1006013Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1006278Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1006505Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14175 2022-11-23T01:46:56.1006727Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14176 2022-11-23T01:46:56.1007106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1007285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1007668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1007843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1008214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1008393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1008776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1008972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1009224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1009519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1009931Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1010315Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1010550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1010793Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1011021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1011346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1011751Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1012150Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1012253Z ok (4.358s) 2022-11-23T01:46:56.1012273Z 2022-11-23T01:46:56.1012541Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1012635Z Ran 1 test in 4.358s 2022-11-23T01:46:56.1012674Z 2022-11-23T01:46:56.1012748Z OK 2022-11-23T01:46:56.1012767Z 2022-11-23T01:46:56.1012892Z Generating XML reports... 
2022-11-23T01:46:56.1013349Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013854.xml 2022-11-23T01:46:56.1013731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1013910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1014302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1014497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1014517Z 2022-11-23T01:46:56.1014627Z Running tests... 2022-11-23T01:46:56.1014875Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1015192Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1015462Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1015686Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14278 2022-11-23T01:46:56.1015912Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14279 2022-11-23T01:46:56.1016291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1016471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1016856Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1017029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1017404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1017581Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1017965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1018160Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1018416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1018664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1019125Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1019533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1019751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1019978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1020241Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqjuoc94h 2022-11-23T01:46:56.1020513Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqjuoc94h/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1020817Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppleg6iou 2022-11-23T01:46:56.1021093Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppleg6iou/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1021371Z [1669167547.101097] [08317a7e7676:14278:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1021606Z [1669167547.114663] [08317a7e7676:14278:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1021846Z [1669167547.114663] [08317a7e7676:14278:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1022102Z [1669167547.101120] [08317a7e7676:14279:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1022338Z [1669167547.114392] [08317a7e7676:14279:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1022579Z [1669167547.114392] [08317a7e7676:14279:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1022684Z ok (6.617s) 2022-11-23T01:46:56.1022703Z 2022-11-23T01:46:56.1022975Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1023090Z Ran 1 test in 6.617s 2022-11-23T01:46:56.1023110Z 2022-11-23T01:46:56.1023203Z OK 2022-11-23T01:46:56.1023222Z 2022-11-23T01:46:56.1023347Z Generating XML reports... 2022-11-23T01:46:56.1023801Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013901.xml 2022-11-23T01:46:56.1024162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1024348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1024733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1024929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1024951Z 2022-11-23T01:46:56.1025061Z Running tests... 2022-11-23T01:46:56.1025329Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1025648Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1025893Z test_irecv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1026117Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14396 2022-11-23T01:46:56.1026320Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14397 2022-11-23T01:46:56.1026698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1026881Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1027265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1027518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1027896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1028078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1028490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1028668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1028917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1029464Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1029874Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1030278Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1030515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1030748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1031025Z [1669167554.904985] [08317a7e7676:14396:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1031260Z [1669167556.356741] [08317a7e7676:14396:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1031503Z [1669167556.356741] [08317a7e7676:14396:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1031761Z [1669167554.926296] [08317a7e7676:14397:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1031993Z [1669167556.324836] [08317a7e7676:14397:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1032232Z [1669167556.324836] [08317a7e7676:14397:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1032334Z ok (6.185s) 2022-11-23T01:46:56.1032354Z 2022-11-23T01:46:56.1032624Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1032736Z Ran 1 test in 6.186s 2022-11-23T01:46:56.1032756Z 2022-11-23T01:46:56.1032847Z OK 2022-11-23T01:46:56.1032866Z 2022-11-23T01:46:56.1032997Z Generating XML reports... 
2022-11-23T01:46:56.1033447Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013910.xml 2022-11-23T01:46:56.1033812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1033991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1034377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1034571Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1034591Z 2022-11-23T01:46:56.1034700Z Running tests... 2022-11-23T01:46:56.1034966Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1035281Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1035524Z test_isend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1035733Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14506 2022-11-23T01:46:56.1035955Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14507 2022-11-23T01:46:56.1036395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1036578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1036964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1037157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1037531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1037708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1038146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1038322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1038577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1038826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1039233Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1039639Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1039871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1040103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1040385Z [1669167563.619000] [08317a7e7676:14506:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1040625Z [1669167565.028134] [08317a7e7676:14506:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1040847Z [1669167565.028134] [08317a7e7676:14506:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1041122Z [1669167563.639496] [08317a7e7676:14507:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1041352Z [1669167565.060623] [08317a7e7676:14507:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1041587Z [1669167565.060623] [08317a7e7676:14507:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1041693Z ok (6.172s) 2022-11-23T01:46:56.1041712Z 2022-11-23T01:46:56.1041986Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1042101Z Ran 1 test in 6.172s 2022-11-23T01:46:56.1042120Z 2022-11-23T01:46:56.1042212Z OK 2022-11-23T01:46:56.1042231Z 2022-11-23T01:46:56.1042360Z Generating XML reports... 
2022-11-23T01:46:56.1042792Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013919.xml 2022-11-23T01:46:56.1043169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1043348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1043733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1043927Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1043950Z 2022-11-23T01:46:56.1044059Z Running tests... 2022-11-23T01:46:56.1044323Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1044638Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1044958Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1045169Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14616 2022-11-23T01:46:56.1045390Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14617 2022-11-23T01:46:56.1045767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1045943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1046328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1046568Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1046943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1047125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1047493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1047686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1047933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1048184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1048589Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1048994Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1049224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1049567Z STAGE:2022-11-23 01:39:32 14617:14617 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1049801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1050119Z STAGE:2022-11-23 01:39:32 14616:14616 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1050398Z [1669167572.395053] [08317a7e7676:14617:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1050631Z [1669167574.029211] [08317a7e7676:14617:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1050870Z [1669167574.029211] [08317a7e7676:14617:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1051218Z STAGE:2022-11-23 01:39:34 14617:14617 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1051574Z STAGE:2022-11-23 01:39:34 14617:14617 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1051850Z [1669167572.393443] [08317a7e7676:14616:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1052083Z [1669167574.009694] [08317a7e7676:14616:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1052324Z [1669167574.009694] [08317a7e7676:14616:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1052666Z STAGE:2022-11-23 01:39:34 14616:14616 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1053001Z STAGE:2022-11-23 01:39:34 14616:14616 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1053105Z ok (6.667s) 2022-11-23T01:46:56.1053125Z 2022-11-23T01:46:56.1053393Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1053554Z Ran 1 test in 6.667s 2022-11-23T01:46:56.1053577Z 2022-11-23T01:46:56.1053670Z OK 2022-11-23T01:46:56.1053688Z 2022-11-23T01:46:56.1053815Z Generating XML reports... 2022-11-23T01:46:56.1054270Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013928.xml 2022-11-23T01:46:56.1054647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1054808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1055194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1055437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1055457Z 2022-11-23T01:46:56.1055567Z Running tests... 2022-11-23T01:46:56.1055832Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1056152Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1056419Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1056645Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14730 2022-11-23T01:46:56.1056866Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14731 2022-11-23T01:46:56.1057221Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1057398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1057783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1057976Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1058349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1058526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1058910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1059100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1059329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1059577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1059981Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1060387Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1060623Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1060962Z STAGE:2022-11-23 01:39:41 14731:14731 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1061194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1061532Z STAGE:2022-11-23 01:39:41 14730:14730 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1061811Z [1669167581.733015] [08317a7e7676:14731:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1062045Z [1669167583.344749] [08317a7e7676:14731:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1062271Z [1669167583.344749] [08317a7e7676:14731:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1062667Z STAGE:2022-11-23 01:39:43 14731:14731 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1063029Z STAGE:2022-11-23 01:39:43 14731:14731 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1063302Z [1669167581.711869] [08317a7e7676:14730:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1063531Z [1669167583.361017] [08317a7e7676:14730:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1063764Z [1669167583.361017] [08317a7e7676:14730:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1064150Z STAGE:2022-11-23 01:39:43 14730:14730 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1064501Z STAGE:2022-11-23 01:39:43 14730:14730 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1064605Z ok (6.769s) 2022-11-23T01:46:56.1064629Z 2022-11-23T01:46:56.1064877Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1064992Z Ran 1 test in 6.769s 2022-11-23T01:46:56.1065011Z 2022-11-23T01:46:56.1065105Z OK 2022-11-23T01:46:56.1065124Z 2022-11-23T01:46:56.1065248Z Generating XML reports... 2022-11-23T01:46:56.1065698Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013937.xml 2022-11-23T01:46:56.1066077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1066254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1066643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1066837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1066857Z 2022-11-23T01:46:56.1066950Z Running tests... 2022-11-23T01:46:56.1067218Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1067534Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1067826Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1068049Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14844 2022-11-23T01:46:56.1068268Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14845 2022-11-23T01:46:56.1068648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1068830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1069428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1069624Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1070000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1070179Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1070564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1070757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1071006Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1071260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1071665Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1072118Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1072361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1072603Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1072830Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1073069Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1073470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1073928Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1074176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:46:56.1074426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:46:56.1074801Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.1075195Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.1075431Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T01:46:56.1075536Z ok (22.462s) 2022-11-23T01:46:56.1075556Z 2022-11-23T01:46:56.1075826Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1075945Z Ran 1 test in 22.462s 2022-11-23T01:46:56.1075965Z 2022-11-23T01:46:56.1076058Z OK 2022-11-23T01:46:56.1076077Z 2022-11-23T01:46:56.1076201Z Generating XML reports... 
2022-11-23T01:46:56.1076637Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013947.xml 2022-11-23T01:46:56.1077017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1077197Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1077582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1077775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1077795Z 2022-11-23T01:46:56.1077905Z Running tests... 2022-11-23T01:46:56.1078170Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1078493Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1078800Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1079008Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 14965 2022-11-23T01:46:56.1079230Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 14966 2022-11-23T01:46:56.1079608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1079785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1080172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1080367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1080746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1080923Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1081354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1081536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1081784Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1082034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1082438Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1082837Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1083120Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1083364Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1083595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1083834Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1084216Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1084610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1084855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:46:56.1085103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:46:56.1085496Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.1085894Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.1086131Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T01:46:56.1086364Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T01:46:56.1086468Z ok (22.285s) 2022-11-23T01:46:56.1086488Z 2022-11-23T01:46:56.1086739Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1086855Z Ran 1 test in 22.286s 2022-11-23T01:46:56.1086874Z 2022-11-23T01:46:56.1086966Z OK 2022-11-23T01:46:56.1086985Z 2022-11-23T01:46:56.1087110Z Generating XML reports... 
2022-11-23T01:46:56.1087564Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014012.xml 2022-11-23T01:46:56.1087946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1088126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1088513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1088688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1088726Z 2022-11-23T01:46:56.1088817Z Running tests... 2022-11-23T01:46:56.1089085Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1089400Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1089819Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:56.1089842Z 2022-11-23T01:46:56.1090106Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1090220Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1090240Z 2022-11-23T01:46:56.1090352Z OK (skipped=1) 2022-11-23T01:46:56.1090371Z 2022-11-23T01:46:56.1090602Z Generating XML reports... 
2022-11-23T01:46:56.1091044Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014036.xml 2022-11-23T01:46:56.1091418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1091594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1091979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1092177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1092241Z 2022-11-23T01:46:56.1092348Z Running tests... 2022-11-23T01:46:56.1092615Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1092931Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1093332Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:56.1093353Z 2022-11-23T01:46:56.1093598Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1093710Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1093730Z 2022-11-23T01:46:56.1093838Z OK (skipped=1) 2022-11-23T01:46:56.1093858Z 2022-11-23T01:46:56.1093984Z Generating XML reports... 
2022-11-23T01:46:56.1094429Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014039.xml 2022-11-23T01:46:56.1094800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1094983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1095368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1095547Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1095584Z 2022-11-23T01:46:56.1095675Z Running tests... 2022-11-23T01:46:56.1095935Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1096253Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1096677Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:56.1096696Z 2022-11-23T01:46:56.1096958Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1097071Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1097095Z 2022-11-23T01:46:56.1097205Z OK (skipped=1) 2022-11-23T01:46:56.1097224Z 2022-11-23T01:46:56.1097349Z Generating XML reports... 
2022-11-23T01:46:56.1097780Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014041.xml 2022-11-23T01:46:56.1098161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1098341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1098723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1098917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1098937Z 2022-11-23T01:46:56.1099046Z Running tests... 2022-11-23T01:46:56.1099309Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1099625Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1100045Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:56.1100065Z 2022-11-23T01:46:56.1100354Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1100474Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1100493Z 2022-11-23T01:46:56.1100600Z OK (skipped=1) 2022-11-23T01:46:56.1100619Z 2022-11-23T01:46:56.1100743Z Generating XML reports... 
2022-11-23T01:46:56.1101190Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014044.xml 2022-11-23T01:46:56.1101563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1101739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1102123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1102362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1102382Z 2022-11-23T01:46:56.1102472Z Running tests... 2022-11-23T01:46:56.1102740Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1103055Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1103470Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:56.1103490Z 2022-11-23T01:46:56.1103748Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1103862Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1103881Z 2022-11-23T01:46:56.1103989Z OK (skipped=1) 2022-11-23T01:46:56.1104008Z 2022-11-23T01:46:56.1104133Z Generating XML reports... 
2022-11-23T01:46:56.1104585Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014046.xml 2022-11-23T01:46:56.1104944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1105127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1105516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1105712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1105732Z 2022-11-23T01:46:56.1105841Z Running tests... 2022-11-23T01:46:56.1106104Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1106420Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1106827Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T01:46:56.1106851Z 2022-11-23T01:46:56.1107114Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1107208Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1107227Z 2022-11-23T01:46:56.1107337Z OK (skipped=1) 2022-11-23T01:46:56.1107360Z 2022-11-23T01:46:56.1107484Z Generating XML reports... 
2022-11-23T01:46:56.1107985Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014048.xml 2022-11-23T01:46:56.1108361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1108541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1108927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1109340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1109366Z 2022-11-23T01:46:56.1109456Z Running tests... 2022-11-23T01:46:56.1109719Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1110031Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1110510Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T01:46:56.1110533Z 2022-11-23T01:46:56.1110804Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1110917Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1110936Z 2022-11-23T01:46:56.1111044Z OK (skipped=1) 2022-11-23T01:46:56.1111064Z 2022-11-23T01:46:56.1111223Z Generating XML reports... 
2022-11-23T01:46:56.1111672Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014051.xml 2022-11-23T01:46:56.1112030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1112272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1112661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1112855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1112874Z 2022-11-23T01:46:56.1112984Z Running tests... 2022-11-23T01:46:56.1113247Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1113562Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1113968Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T01:46:56.1113988Z 2022-11-23T01:46:56.1114251Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1114350Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1114370Z 2022-11-23T01:46:56.1114478Z OK (skipped=1) 2022-11-23T01:46:56.1114497Z 2022-11-23T01:46:56.1114622Z Generating XML reports... 
2022-11-23T01:46:56.1115076Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014053.xml 2022-11-23T01:46:56.1115454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1115632Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1116017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1116211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1116230Z 2022-11-23T01:46:56.1116338Z Running tests... 2022-11-23T01:46:56.1116585Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1116904Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1117305Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T01:46:56.1117325Z 2022-11-23T01:46:56.1117591Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1117704Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1117723Z 2022-11-23T01:46:56.1117831Z OK (skipped=1) 2022-11-23T01:46:56.1117850Z 2022-11-23T01:46:56.1117974Z Generating XML reports... 
2022-11-23T01:46:56.1118424Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014056.xml 2022-11-23T01:46:56.1118782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1118961Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1119349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1119544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1119563Z 2022-11-23T01:46:56.1119672Z Running tests... 2022-11-23T01:46:56.1119982Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1120305Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1120609Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL backend supports high priority stream (0.002s) 2022-11-23T01:46:56.1120629Z 2022-11-23T01:46:56.1120893Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1120987Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1121025Z 2022-11-23T01:46:56.1121115Z OK (skipped=1) 2022-11-23T01:46:56.1121134Z 2022-11-23T01:46:56.1121305Z Generating XML reports... 
2022-11-23T01:46:56.1121754Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014058.xml 2022-11-23T01:46:56.1122127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1122307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1122693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1122887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1122907Z 2022-11-23T01:46:56.1123016Z Running tests... 2022-11-23T01:46:56.1123262Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1123577Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1123832Z test_new_subgroups (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:46:56.1123855Z 2022-11-23T01:46:56.1124121Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1124233Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1124252Z 2022-11-23T01:46:56.1124359Z OK (skipped=1) 2022-11-23T01:46:56.1124382Z 2022-11-23T01:46:56.1124506Z Generating XML reports... 
2022-11-23T01:46:56.1124955Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014100.xml 2022-11-23T01:46:56.1125332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1125492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1125878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1126071Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1126095Z 2022-11-23T01:46:56.1126203Z Running tests... 2022-11-23T01:46:56.1126466Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1126780Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1127058Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:46:56.1127078Z 2022-11-23T01:46:56.1127345Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1127438Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1127476Z 2022-11-23T01:46:56.1127565Z OK (skipped=1) 2022-11-23T01:46:56.1127584Z 2022-11-23T01:46:56.1127705Z Generating XML reports... 
2022-11-23T01:46:56.1128152Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014103.xml 2022-11-23T01:46:56.1128525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1128708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1129140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1129337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1129357Z 2022-11-23T01:46:56.1129466Z Running tests... 2022-11-23T01:46:56.1129712Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1130026Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1130342Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:46:56.1130362Z 2022-11-23T01:46:56.1130623Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1130779Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1130799Z 2022-11-23T01:46:56.1130907Z OK (skipped=1) 2022-11-23T01:46:56.1130926Z 2022-11-23T01:46:56.1131050Z Generating XML reports... 
2022-11-23T01:46:56.1131505Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014105.xml 2022-11-23T01:46:56.1131883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1132041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1132426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1132625Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1132645Z 2022-11-23T01:46:56.1132753Z Running tests... 2022-11-23T01:46:56.1133016Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1133337Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1133646Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1133876Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15515 2022-11-23T01:46:56.1134083Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15516 2022-11-23T01:46:56.1134460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1134637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1135024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1135217Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1135593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1135774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1136160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1136354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1136586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1136836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1137244Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1137646Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1137884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1138115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1138218Z ok (4.229s) 2022-11-23T01:46:56.1138239Z 2022-11-23T01:46:56.1138553Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1138670Z Ran 1 test in 4.229s 2022-11-23T01:46:56.1138690Z 2022-11-23T01:46:56.1138764Z OK 2022-11-23T01:46:56.1138782Z 2022-11-23T01:46:56.1138907Z Generating XML reports... 2022-11-23T01:46:56.1139361Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014108.xml 2022-11-23T01:46:56.1139736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1139913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1140344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1140539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1140558Z 2022-11-23T01:46:56.1140668Z Running tests... 2022-11-23T01:46:56.1140920Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1141238Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1141536Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1141760Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15618 2022-11-23T01:46:56.1141980Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15619 2022-11-23T01:46:56.1142353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1142535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1142941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1143139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1143497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1143674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1144060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1144253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1144504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1144755Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1145162Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1145569Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1145804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1146017Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1146120Z ok (4.366s) 2022-11-23T01:46:56.1146139Z 2022-11-23T01:46:56.1146407Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1146519Z Ran 1 test in 4.366s 2022-11-23T01:46:56.1146539Z 2022-11-23T01:46:56.1146631Z OK 2022-11-23T01:46:56.1146650Z 2022-11-23T01:46:56.1146774Z Generating XML reports... 2022-11-23T01:46:56.1147228Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014114.xml 2022-11-23T01:46:56.1147602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1147817Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1148215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1148402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1148422Z 2022-11-23T01:46:56.1148531Z Running tests... 2022-11-23T01:46:56.1148805Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1149320Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1149590Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 4 (0.002s) 2022-11-23T01:46:56.1149701Z 2022-11-23T01:46:56.1149953Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1150065Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1150085Z 2022-11-23T01:46:56.1150196Z OK (skipped=1) 2022-11-23T01:46:56.1150219Z 2022-11-23T01:46:56.1150341Z Generating XML reports... 2022-11-23T01:46:56.1150788Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014121.xml 2022-11-23T01:46:56.1151164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1151342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1151725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1151902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1151925Z 2022-11-23T01:46:56.1152035Z Running tests... 2022-11-23T01:46:56.1152298Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1152613Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1152922Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T01:46:56.1152942Z 2022-11-23T01:46:56.1153203Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1153312Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1153332Z 2022-11-23T01:46:56.1153440Z OK (skipped=1) 2022-11-23T01:46:56.1153459Z 2022-11-23T01:46:56.1153583Z Generating XML reports... 
2022-11-23T01:46:56.1154013Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014124.xml 2022-11-23T01:46:56.1154390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1154569Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1154956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1155151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1155171Z 2022-11-23T01:46:56.1155278Z Running tests... 2022-11-23T01:46:56.1155539Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1155852Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1156118Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1156879Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78112 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.626s) 2022-11-23T01:46:56.1156925Z 2022-11-23T01:46:56.1157169Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1157358Z Ran 1 test in 1.626s 2022-11-23T01:46:56.1157380Z 2022-11-23T01:46:56.1157494Z OK (skipped=1) 2022-11-23T01:46:56.1157513Z 2022-11-23T01:46:56.1157636Z Generating XML reports... 
2022-11-23T01:46:56.1158086Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014126.xml 2022-11-23T01:46:56.1158464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1158643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1159024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1159249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1159284Z 2022-11-23T01:46:56.1159375Z Running tests... 2022-11-23T01:46:56.1159639Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1159953Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1160229Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1160441Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15821 2022-11-23T01:46:56.1160652Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15822 2022-11-23T01:46:56.1161016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1161180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1161550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1161859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1162292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1162457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1162836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1163028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1163278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1163528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1163920Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1164431Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1164670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1164904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1165164Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdjff3nk9 2022-11-23T01:46:56.1165440Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdjff3nk9/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1165692Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppg4yfc7g 2022-11-23T01:46:56.1165963Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppg4yfc7g/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1166242Z [1669167695.966764] [08317a7e7676:15821:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1166521Z [1669167695.980317] [08317a7e7676:15821:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1166768Z [1669167695.980317] [08317a7e7676:15821:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1167043Z [1669167695.976634] [08317a7e7676:15822:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1167272Z [1669167695.989872] [08317a7e7676:15822:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1167509Z [1669167695.989872] [08317a7e7676:15822:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1167660Z ok (6.656s) 2022-11-23T01:46:56.1167680Z 2022-11-23T01:46:56.1167953Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1168067Z Ran 1 test in 6.657s 2022-11-23T01:46:56.1168087Z 2022-11-23T01:46:56.1168179Z OK 2022-11-23T01:46:56.1168201Z 2022-11-23T01:46:56.1168327Z Generating XML reports... 2022-11-23T01:46:56.1168763Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014130.xml 2022-11-23T01:46:56.1169142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1169318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1169706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1169899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1169921Z 2022-11-23T01:46:56.1170028Z Running tests... 2022-11-23T01:46:56.1170296Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1170613Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1170875Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1171096Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 15939 2022-11-23T01:46:56.1171318Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 15940 2022-11-23T01:46:56.1171691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1171870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1172258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1172456Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1172833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1173013Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1173378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1173569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1173814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1174064Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1174470Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1174878Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1175113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1175392Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1175674Z [1669167706.131945] [08317a7e7676:15939:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1175932Z [1669167706.133552] [08317a7e7676:15940:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1176162Z [1669167706.146181] [08317a7e7676:15939:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1176400Z [1669167706.146181] [08317a7e7676:15939:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1176665Z [1669167706.146184] [08317a7e7676:15940:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1176900Z [1669167706.146184] [08317a7e7676:15940:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1177004Z ok (7.158s) 2022-11-23T01:46:56.1177023Z 2022-11-23T01:46:56.1177292Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1177406Z Ran 1 test in 7.158s 2022-11-23T01:46:56.1177425Z 2022-11-23T01:46:56.1177518Z OK 2022-11-23T01:46:56.1177537Z 2022-11-23T01:46:56.1177645Z Generating XML reports... 
2022-11-23T01:46:56.1178099Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014139.xml 2022-11-23T01:46:56.1178476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1178659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1179042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1179236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1179265Z 2022-11-23T01:46:56.1179385Z Running tests... 2022-11-23T01:46:56.1179650Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1179968Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1180242Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1180466Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16054 2022-11-23T01:46:56.1180685Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16055 2022-11-23T01:46:56.1181067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1181244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1181632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1181827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1182199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1182359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1182741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1182933Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1183186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1183433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1183889Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1184302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1184537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1184769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1185025Z [1669167715.872855] [08317a7e7676:16055:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1185300Z [1669167715.872239] [08317a7e7676:16054:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1185578Z [1669167715.886176] [08317a7e7676:16054:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1185819Z [1669167715.886176] [08317a7e7676:16054:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1186044Z [1669167715.885884] [08317a7e7676:16055:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1186273Z [1669167715.885884] [08317a7e7676:16055:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1186378Z ok (7.180s) 2022-11-23T01:46:56.1186398Z 2022-11-23T01:46:56.1186665Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1186777Z Ran 1 test in 7.180s 2022-11-23T01:46:56.1186799Z 2022-11-23T01:46:56.1186873Z OK 2022-11-23T01:46:56.1186910Z 2022-11-23T01:46:56.1187017Z Generating XML reports... 
2022-11-23T01:46:56.1187489Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014149.xml 2022-11-23T01:46:56.1187874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1188052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1188438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1188632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1188652Z 2022-11-23T01:46:56.1188760Z Running tests... 2022-11-23T01:46:56.1189185Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1189493Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1189783Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1190546Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77123 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.650s) 2022-11-23T01:46:56.1190567Z 2022-11-23T01:46:56.1190833Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1190945Z Ran 1 test in 1.650s 2022-11-23T01:46:56.1190964Z 2022-11-23T01:46:56.1191072Z OK (skipped=1) 2022-11-23T01:46:56.1191091Z 2022-11-23T01:46:56.1191213Z Generating XML reports... 
2022-11-23T01:46:56.1191665Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014159.xml 2022-11-23T01:46:56.1192047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1192221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1192659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1192860Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1192880Z 2022-11-23T01:46:56.1192988Z Running tests... 2022-11-23T01:46:56.1193254Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1193567Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1193865Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1194614Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77292 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.647s) 2022-11-23T01:46:56.1194690Z 2022-11-23T01:46:56.1194964Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1195081Z Ran 1 test in 1.648s 2022-11-23T01:46:56.1195101Z 2022-11-23T01:46:56.1195189Z OK (skipped=1) 2022-11-23T01:46:56.1195227Z 2022-11-23T01:46:56.1195333Z Generating XML reports... 
2022-11-23T01:46:56.1195781Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014203.xml 2022-11-23T01:46:56.1196161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1196336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1196720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1196915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1196935Z 2022-11-23T01:46:56.1197043Z Running tests... 2022-11-23T01:46:56.1197309Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1197607Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1197925Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1198147Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16237 2022-11-23T01:46:56.1198368Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16238 2022-11-23T01:46:56.1198745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1198925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1199308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1199503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1199873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1200031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1200411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1200603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1200851Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1201098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1201508Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1201955Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1202196Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1202431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1202565Z skip: Need at least 4 CUDA devices (4.237s) 2022-11-23T01:46:56.1202587Z 2022-11-23T01:46:56.1202853Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1202963Z Ran 1 test in 4.237s 2022-11-23T01:46:56.1202983Z 2022-11-23T01:46:56.1203087Z OK (skipped=1) 2022-11-23T01:46:56.1203107Z 2022-11-23T01:46:56.1203231Z Generating XML reports... 2022-11-23T01:46:56.1203729Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014207.xml 2022-11-23T01:46:56.1204105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1204286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1204652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1204846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1204866Z 2022-11-23T01:46:56.1204976Z Running tests... 2022-11-23T01:46:56.1205239Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1205554Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1205886Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1206113Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16340 2022-11-23T01:46:56.1206333Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16341 2022-11-23T01:46:56.1206711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1206870Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1207250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1207441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1207813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1207991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1208369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1208558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1208809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1209039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1209443Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1209845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1210077Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1210312Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1210467Z skip: Need at least 4 CUDA devices (4.215s) 2022-11-23T01:46:56.1210486Z 2022-11-23T01:46:56.1210752Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1210862Z Ran 1 test in 4.215s 2022-11-23T01:46:56.1210881Z 2022-11-23T01:46:56.1211035Z OK (skipped=1) 2022-11-23T01:46:56.1211056Z 2022-11-23T01:46:56.1211164Z Generating XML reports... 2022-11-23T01:46:56.1211666Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014214.xml 2022-11-23T01:46:56.1212044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1212219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1212606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1212850Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1212869Z 2022-11-23T01:46:56.1212977Z Running tests... 2022-11-23T01:46:56.1213240Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1213558Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1213833Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1214592Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/84886 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.608s) 2022-11-23T01:46:56.1214612Z 2022-11-23T01:46:56.1214875Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1214987Z Ran 1 test in 1.608s 2022-11-23T01:46:56.1215010Z 2022-11-23T01:46:56.1215118Z OK (skipped=1) 2022-11-23T01:46:56.1215137Z 2022-11-23T01:46:56.1215261Z Generating XML reports... 2022-11-23T01:46:56.1215712Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014221.xml 2022-11-23T01:46:56.1216091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1216270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1216639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1216833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1216853Z 2022-11-23T01:46:56.1216962Z Running tests... 2022-11-23T01:46:56.1217226Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1217539Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1217811Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1218034Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16477 2022-11-23T01:46:56.1218258Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16478 2022-11-23T01:46:56.1218635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1218795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1219178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1219369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1219741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1219922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1220309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1220563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1220819Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1221050Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1221458Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1221860Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1222092Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1222384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1222611Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1222852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1223254Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1223653Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1223992Z STAGE:2022-11-23 01:42:29 16477:16477 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1224301Z STAGE:2022-11-23 01:42:29 16478:16478 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1224580Z [1669167749.456310] [08317a7e7676:16478:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1224818Z [1669167751.073435] [08317a7e7676:16478:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1225059Z [1669167751.073435] [08317a7e7676:16478:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1225338Z [1669167749.435065] [08317a7e7676:16477:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1225567Z [1669167751.110411] [08317a7e7676:16477:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1225803Z [1669167751.110411] [08317a7e7676:16477:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1226359Z STAGE:2022-11-23 01:42:31 16478:16478 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:42:31 16477:16477 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1226383Z 2022-11-23T01:46:56.1226740Z STAGE:2022-11-23 01:42:31 16478:16478 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1227092Z STAGE:2022-11-23 01:42:31 16477:16477 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1227403Z STAGE:2022-11-23 01:42:31 16478:16478 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1227723Z STAGE:2022-11-23 01:42:31 16477:16477 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1228059Z STAGE:2022-11-23 01:42:31 16478:16478 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1228403Z STAGE:2022-11-23 01:42:31 16478:16478 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1228741Z STAGE:2022-11-23 01:42:31 16477:16477 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1229245Z STAGE:2022-11-23 01:42:31 16477:16477 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1229352Z ok (6.680s) 2022-11-23T01:46:56.1229440Z 2022-11-23T01:46:56.1229720Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1229834Z Ran 1 test in 6.680s 2022-11-23T01:46:56.1229854Z 2022-11-23T01:46:56.1229928Z OK 2022-11-23T01:46:56.1229947Z 2022-11-23T01:46:56.1230073Z Generating XML reports... 
2022-11-23T01:46:56.1230527Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014225.xml 2022-11-23T01:46:56.1230904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1231083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1231592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1231786Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1231807Z 2022-11-23T01:46:56.1231916Z Running tests... 2022-11-23T01:46:56.1232184Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1232485Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1232755Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1232979Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16591 2022-11-23T01:46:56.1233198Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16592 2022-11-23T01:46:56.1233575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1233758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1234144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1234340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1234696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1234873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1235256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1235446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1235693Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1235940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1236354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1236761Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1236996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1237221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1237449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1237685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1238084Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1238484Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1238824Z STAGE:2022-11-23 01:42:38 16591:16591 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1239201Z STAGE:2022-11-23 01:42:38 16592:16592 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1239486Z [1669167758.608426] [08317a7e7676:16592:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1239721Z [1669167760.247829] [08317a7e7676:16592:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1239942Z [1669167760.247829] [08317a7e7676:16592:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1240218Z [1669167758.605791] [08317a7e7676:16591:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1240493Z [1669167760.279728] [08317a7e7676:16591:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1240734Z [1669167760.279728] [08317a7e7676:16591:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1241292Z STAGE:2022-11-23 01:42:40 16592:16592 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:42:40 16591:16591 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1241313Z 2022-11-23T01:46:56.1241668Z STAGE:2022-11-23 01:42:40 16592:16592 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1242018Z STAGE:2022-11-23 01:42:40 16591:16591 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1242346Z STAGE:2022-11-23 01:42:40 16591:16591 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1242672Z STAGE:2022-11-23 01:42:40 16592:16592 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1243006Z STAGE:2022-11-23 01:42:40 16592:16592 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1243571Z STAGE:2022-11-23 01:42:40 16591:16591 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:42:40 16592:16592 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1243593Z 2022-11-23T01:46:56.1243946Z STAGE:2022-11-23 01:42:40 16591:16591 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1244032Z ok (6.673s) 2022-11-23T01:46:56.1244051Z 2022-11-23T01:46:56.1244318Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1244431Z Ran 1 test in 6.673s 2022-11-23T01:46:56.1244450Z 2022-11-23T01:46:56.1244542Z OK 2022-11-23T01:46:56.1244565Z 2022-11-23T01:46:56.1244688Z Generating XML reports... 
2022-11-23T01:46:56.1245145Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014234.xml 2022-11-23T01:46:56.1245527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1245708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1246079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1246273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1246293Z 2022-11-23T01:46:56.1246398Z Running tests... 2022-11-23T01:46:56.1246660Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1246976Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1247254Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1247477Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16705 2022-11-23T01:46:56.1247750Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16706 2022-11-23T01:46:56.1248120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1248298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1248682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1248875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1249247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1249424Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1249859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1250052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1250306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1250538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1250945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1251347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1251580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1251822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1252053Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1252289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1252691Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1253087Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1253408Z STAGE:2022-11-23 01:42:47 16705:16705 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1253743Z STAGE:2022-11-23 01:42:47 16706:16706 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1254022Z [1669167767.795519] [08317a7e7676:16706:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1254254Z [1669167769.422218] [08317a7e7676:16706:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1254497Z [1669167769.422218] [08317a7e7676:16706:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1254775Z [1669167767.789172] [08317a7e7676:16705:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1255005Z [1669167769.421036] [08317a7e7676:16705:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1255240Z [1669167769.421036] [08317a7e7676:16705:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1255794Z STAGE:2022-11-23 01:42:49 16706:16706 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:42:49 16705:16705 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1255819Z 2022-11-23T01:46:56.1256175Z STAGE:2022-11-23 01:42:49 16706:16706 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1256573Z STAGE:2022-11-23 01:42:49 16705:16705 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1256897Z STAGE:2022-11-23 01:42:49 16706:16706 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1257223Z STAGE:2022-11-23 01:42:49 16705:16705 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1257563Z STAGE:2022-11-23 01:42:49 16706:16706 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1257892Z STAGE:2022-11-23 01:42:49 16705:16705 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1258238Z STAGE:2022-11-23 01:42:49 16706:16706 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1258634Z STAGE:2022-11-23 01:42:49 16705:16705 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1258736Z ok (6.609s) 2022-11-23T01:46:56.1258756Z 2022-11-23T01:46:56.1259027Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1259142Z Ran 1 test in 6.610s 2022-11-23T01:46:56.1259161Z 2022-11-23T01:46:56.1259235Z OK 2022-11-23T01:46:56.1259254Z 2022-11-23T01:46:56.1259377Z Generating XML reports... 
2022-11-23T01:46:56.1259832Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014243.xml 2022-11-23T01:46:56.1260209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1260388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1260774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1260972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1260991Z 2022-11-23T01:46:56.1261100Z Running tests... 2022-11-23T01:46:56.1261349Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1261666Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1261933Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1262154Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16819 2022-11-23T01:46:56.1262374Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16820 2022-11-23T01:46:56.1262750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1262926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1263317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1263511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1263873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1264050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1264430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1264619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1264868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1265117Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1265530Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1265931Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1266193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1266442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1266666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1266904Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1267305Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1267698Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1268088Z STAGE:2022-11-23 01:42:56 16820:16820 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1268413Z STAGE:2022-11-23 01:42:56 16819:16819 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1268692Z [1669167776.934421] [08317a7e7676:16820:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1268924Z [1669167778.584905] [08317a7e7676:16820:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1269321Z [1669167778.584905] [08317a7e7676:16820:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1269595Z [1669167776.913800] [08317a7e7676:16819:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1269829Z [1669167778.540177] [08317a7e7676:16819:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1270066Z [1669167778.540177] [08317a7e7676:16819:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1270630Z STAGE:2022-11-23 01:42:58 16820:16820 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:42:58 16819:16819 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1270653Z 2022-11-23T01:46:56.1271227Z STAGE:2022-11-23 01:42:58 16819:16819 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:42:58 16820:16820 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1271248Z 2022-11-23T01:46:56.1271580Z STAGE:2022-11-23 01:42:59 16820:16820 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1271904Z STAGE:2022-11-23 01:42:59 16819:16819 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1272247Z STAGE:2022-11-23 01:42:59 16820:16820 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1272808Z STAGE:2022-11-23 01:42:59 16820:16820 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:42:59 16819:16819 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1272828Z 2022-11-23T01:46:56.1273180Z STAGE:2022-11-23 01:42:59 16819:16819 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1273282Z ok (6.533s) 2022-11-23T01:46:56.1273302Z 2022-11-23T01:46:56.1273552Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1273668Z Ran 1 test in 6.533s 2022-11-23T01:46:56.1273687Z 2022-11-23T01:46:56.1273778Z OK 2022-11-23T01:46:56.1273797Z 2022-11-23T01:46:56.1273920Z Generating XML reports... 
2022-11-23T01:46:56.1274377Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014253.xml 2022-11-23T01:46:56.1274757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1275006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1275406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1275583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1275622Z 2022-11-23T01:46:56.1275713Z Running tests... 2022-11-23T01:46:56.1275981Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1276294Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1276555Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1276840Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 16933 2022-11-23T01:46:56.1277061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 16934 2022-11-23T01:46:56.1277444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1277622Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1277989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1278180Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1278548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1278724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1279106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1279301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1279549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1279801Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1280191Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1280598Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1280833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1281062Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1281225Z skip: Skipped due to small world size. (4.257s) 2022-11-23T01:46:56.1281248Z 2022-11-23T01:46:56.1281517Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1281630Z Ran 1 test in 4.257s 2022-11-23T01:46:56.1281650Z 2022-11-23T01:46:56.1281761Z OK (skipped=1) 2022-11-23T01:46:56.1281780Z 2022-11-23T01:46:56.1281909Z Generating XML reports... 2022-11-23T01:46:56.1282345Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014302.xml 2022-11-23T01:46:56.1282716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1282892Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1283277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1283470Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1283494Z 2022-11-23T01:46:56.1283602Z Running tests... 2022-11-23T01:46:56.1283862Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1284180Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1284500Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1284710Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17036 2022-11-23T01:46:56.1284933Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17037 2022-11-23T01:46:56.1285309Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1285486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1285869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1286109Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1286482Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1286662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1287029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1287224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1287471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1287719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1288127Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1288533Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1288766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1288999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1289161Z skip: Skipped due to small world size. (4.367s) 2022-11-23T01:46:56.1289182Z 2022-11-23T01:46:56.1289430Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1289542Z Ran 1 test in 4.368s 2022-11-23T01:46:56.1289561Z 2022-11-23T01:46:56.1289667Z OK (skipped=1) 2022-11-23T01:46:56.1289686Z 2022-11-23T01:46:56.1289808Z Generating XML reports... 2022-11-23T01:46:56.1290255Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014308.xml 2022-11-23T01:46:56.1290624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1290806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1291189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1291385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1291405Z 2022-11-23T01:46:56.1291496Z Running tests... 2022-11-23T01:46:56.1291762Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1292077Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1292347Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1292569Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17139 2022-11-23T01:46:56.1292792Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17140 2022-11-23T01:46:56.1293166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1293342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1293752Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1293952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1294330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1294505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1294887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1295078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1295375Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1295622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1296032Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1296415Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1296648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1296880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1297040Z skip: Skipped due to small world size. (4.271s) 2022-11-23T01:46:56.1297060Z 2022-11-23T01:46:56.1297329Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1297447Z Ran 1 test in 4.271s 2022-11-23T01:46:56.1297467Z 2022-11-23T01:46:56.1297573Z OK (skipped=1) 2022-11-23T01:46:56.1297592Z 2022-11-23T01:46:56.1297715Z Generating XML reports... 2022-11-23T01:46:56.1298151Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014315.xml 2022-11-23T01:46:56.1298530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1298706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1299090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1299282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1299302Z 2022-11-23T01:46:56.1299410Z Running tests... 2022-11-23T01:46:56.1299674Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1299996Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1300255Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1300463Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17242 2022-11-23T01:46:56.1300683Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17243 2022-11-23T01:46:56.1301054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1301229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1301607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1301800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1302172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1302349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1302713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1302949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1303202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1303449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1303854Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1304254Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1304487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1304768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1304930Z skip: Skipped due to small world size. (4.270s) 2022-11-23T01:46:56.1304949Z 2022-11-23T01:46:56.1305203Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1305314Z Ran 1 test in 4.271s 2022-11-23T01:46:56.1305333Z 2022-11-23T01:46:56.1305438Z OK (skipped=1) 2022-11-23T01:46:56.1305457Z 2022-11-23T01:46:56.1305580Z Generating XML reports... 2022-11-23T01:46:56.1306030Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014322.xml 2022-11-23T01:46:56.1306404Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1306581Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1306962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1307157Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1307177Z 2022-11-23T01:46:56.1307268Z Running tests... 2022-11-23T01:46:56.1307533Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1307848Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1308100Z test_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1308320Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17345 2022-11-23T01:46:56.1308539Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17346 2022-11-23T01:46:56.1308913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1309252Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1309623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1309815Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1310188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1310365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1310747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1310938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1311183Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1311462Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1311875Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1312336Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1312581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1312922Z STAGE:2022-11-23 01:43:33 17346:17346 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1313153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1313489Z STAGE:2022-11-23 01:43:33 17345:17345 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1313769Z [1669167813.545321] [08317a7e7676:17346:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1314064Z [1669167815.164007] [08317a7e7676:17346:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1314303Z [1669167815.164007] [08317a7e7676:17346:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1314580Z [1669167813.523843] [08317a7e7676:17345:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1314810Z [1669167815.193832] [08317a7e7676:17345:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1315031Z [1669167815.193832] [08317a7e7676:17345:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1315591Z STAGE:2022-11-23 01:43:35 17346:17346 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:43:35 17345:17345 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1315617Z 2022-11-23T01:46:56.1315969Z STAGE:2022-11-23 01:43:35 17346:17346 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1316321Z STAGE:2022-11-23 01:43:35 17345:17345 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1316650Z STAGE:2022-11-23 01:43:35 17346:17346 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1316974Z STAGE:2022-11-23 01:43:35 17345:17345 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1317307Z STAGE:2022-11-23 01:43:35 17346:17346 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1317655Z STAGE:2022-11-23 01:43:35 17346:17346 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1317990Z STAGE:2022-11-23 01:43:35 17345:17345 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1318339Z STAGE:2022-11-23 01:43:35 17345:17345 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1318425Z ok (6.664s) 2022-11-23T01:46:56.1318444Z 2022-11-23T01:46:56.1318712Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1318828Z Ran 1 test in 6.665s 2022-11-23T01:46:56.1318848Z 2022-11-23T01:46:56.1318940Z OK 2022-11-23T01:46:56.1318960Z 2022-11-23T01:46:56.1319083Z Generating XML reports... 
2022-11-23T01:46:56.1319537Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014329.xml 2022-11-23T01:46:56.1319916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1320093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1320461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1320657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1320676Z 2022-11-23T01:46:56.1320788Z Running tests... 2022-11-23T01:46:56.1321053Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1321417Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1321674Z test_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1321901Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17459 2022-11-23T01:46:56.1322119Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17460 2022-11-23T01:46:56.1322496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1322657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1323039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1323279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1323656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1323834Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1324214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1324405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1324651Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1324881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1325284Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1325689Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1325923Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1326263Z STAGE:2022-11-23 01:43:42 17459:17459 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1326495Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1326833Z STAGE:2022-11-23 01:43:42 17460:17460 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1327112Z [1669167822.752854] [08317a7e7676:17459:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1327342Z [1669167824.430974] [08317a7e7676:17459:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1327568Z [1669167824.430974] [08317a7e7676:17459:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1327847Z [1669167822.774426] [08317a7e7676:17460:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1328077Z [1669167824.417581] [08317a7e7676:17460:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1328314Z [1669167824.417581] [08317a7e7676:17460:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1328873Z STAGE:2022-11-23 01:43:44 17459:17459 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:43:44 17460:17460 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1328894Z 2022-11-23T01:46:56.1329246Z STAGE:2022-11-23 01:43:44 17460:17460 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1329600Z STAGE:2022-11-23 01:43:44 17459:17459 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1329975Z STAGE:2022-11-23 01:43:44 17459:17459 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1330322Z STAGE:2022-11-23 01:43:44 17459:17459 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1330669Z STAGE:2022-11-23 01:43:44 17459:17459 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1331000Z STAGE:2022-11-23 01:43:44 17460:17460 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1331318Z STAGE:2022-11-23 01:43:44 17460:17460 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1331665Z STAGE:2022-11-23 01:43:44 17460:17460 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1331819Z ok (6.854s) 2022-11-23T01:46:56.1331838Z 2022-11-23T01:46:56.1332104Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1332218Z Ran 1 test in 6.854s 2022-11-23T01:46:56.1332238Z 2022-11-23T01:46:56.1332331Z OK 2022-11-23T01:46:56.1332350Z 2022-11-23T01:46:56.1332479Z Generating XML reports... 
2022-11-23T01:46:56.1332932Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014338.xml 2022-11-23T01:46:56.1333294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1333473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1333860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1334052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1334076Z 2022-11-23T01:46:56.1334184Z Running tests... 2022-11-23T01:46:56.1334449Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1334767Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1335050Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports reduce multigpu (0.002s) 2022-11-23T01:46:56.1335070Z 2022-11-23T01:46:56.1335333Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1335427Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1335446Z 2022-11-23T01:46:56.1335554Z OK (skipped=1) 2022-11-23T01:46:56.1335573Z 2022-11-23T01:46:56.1335696Z Generating XML reports... 
2022-11-23T01:46:56.1336146Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014348.xml 2022-11-23T01:46:56.1336523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1336703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1337088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1337284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1337304Z 2022-11-23T01:46:56.1337410Z Running tests... 2022-11-23T01:46:56.1337658Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1337971Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1338229Z test_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1338453Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17606 2022-11-23T01:46:56.1338673Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17607 2022-11-23T01:46:56.1339054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1339234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1339666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1339847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1340224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1340401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1340783Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1340974Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1341219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1341515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1341923Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1342324Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1342539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1342878Z STAGE:2022-11-23 01:43:54 17606:17606 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1343110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1343444Z STAGE:2022-11-23 01:43:54 17607:17607 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1343723Z [1669167834.567193] [08317a7e7676:17607:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1343957Z [1669167836.248178] [08317a7e7676:17607:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1344194Z [1669167836.248178] [08317a7e7676:17607:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1344466Z [1669167834.566457] [08317a7e7676:17606:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1344694Z [1669167836.211256] [08317a7e7676:17606:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1344928Z [1669167836.211256] [08317a7e7676:17606:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1345472Z STAGE:2022-11-23 01:43:56 17607:17607 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:43:56 17606:17606 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1345512Z 2022-11-23T01:46:56.1345849Z STAGE:2022-11-23 01:43:56 17607:17607 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1346200Z STAGE:2022-11-23 01:43:56 17606:17606 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1346530Z STAGE:2022-11-23 01:43:56 17607:17607 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1346855Z STAGE:2022-11-23 01:43:56 17606:17606 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1347190Z STAGE:2022-11-23 01:43:56 17607:17607 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1347516Z STAGE:2022-11-23 01:43:56 17606:17606 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1347865Z STAGE:2022-11-23 01:43:56 17607:17607 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1348210Z STAGE:2022-11-23 01:43:56 17606:17606 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1348354Z ok (6.644s) 2022-11-23T01:46:56.1348394Z 2022-11-23T01:46:56.1348649Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1348763Z Ran 1 test in 6.644s 2022-11-23T01:46:56.1348783Z 2022-11-23T01:46:56.1348873Z OK 2022-11-23T01:46:56.1348892Z 2022-11-23T01:46:56.1349170Z Generating XML reports... 
2022-11-23T01:46:56.1349632Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014350.xml 2022-11-23T01:46:56.1350009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1350190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1350811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1350999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1351035Z 2022-11-23T01:46:56.1351130Z Running tests... 2022-11-23T01:46:56.1351399Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1351715Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1352011Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce_scatter_tensor (0.002s) 2022-11-23T01:46:56.1352031Z 2022-11-23T01:46:56.1352293Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1352404Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1352423Z 2022-11-23T01:46:56.1352530Z OK (skipped=1) 2022-11-23T01:46:56.1352553Z 2022-11-23T01:46:56.1352678Z Generating XML reports... 
2022-11-23T01:46:56.1353113Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014359.xml 2022-11-23T01:46:56.1353493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1353667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1354048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1354242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1354261Z 2022-11-23T01:46:56.1354370Z Running tests... 2022-11-23T01:46:56.1354637Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1355046Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1355313Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports reduce_scatter_v (0.003s) 2022-11-23T01:46:56.1355349Z 2022-11-23T01:46:56.1355598Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1355709Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1355728Z 2022-11-23T01:46:56.1355838Z OK (skipped=1) 2022-11-23T01:46:56.1355857Z 2022-11-23T01:46:56.1355978Z Generating XML reports... 
2022-11-23T01:46:56.1356430Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014402.xml 2022-11-23T01:46:56.1356806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1356983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1357363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1357544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1357580Z 2022-11-23T01:46:56.1357671Z Running tests... 2022-11-23T01:46:56.1357936Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1358319Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1358574Z test_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1358797Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17786 2022-11-23T01:46:56.1359017Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17787 2022-11-23T01:46:56.1359395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1359573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1359937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1360178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1360549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1360725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1361106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1361294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1361539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1361785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1362175Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1362578Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1362812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1363153Z STAGE:2022-11-23 01:44:08 17787:17787 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1363379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1363711Z STAGE:2022-11-23 01:44:08 17786:17786 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1363987Z [1669167848.581807] [08317a7e7676:17787:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1364220Z [1669167850.221703] [08317a7e7676:17787:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1364463Z [1669167850.221703] [08317a7e7676:17787:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1364739Z [1669167848.581781] [08317a7e7676:17786:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1364954Z [1669167850.227255] [08317a7e7676:17786:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1365188Z [1669167850.227255] [08317a7e7676:17786:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1365744Z STAGE:2022-11-23 01:44:10 17787:17787 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:44:10 17786:17786 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1365766Z 2022-11-23T01:46:56.1366114Z STAGE:2022-11-23 01:44:10 17787:17787 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1366467Z STAGE:2022-11-23 01:44:10 17786:17786 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1366841Z STAGE:2022-11-23 01:44:10 17787:17787 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1367171Z STAGE:2022-11-23 01:44:10 17786:17786 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1367505Z STAGE:2022-11-23 01:44:10 17787:17787 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1368067Z STAGE:2022-11-23 01:44:10 17786:17786 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:44:10 17787:17787 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1368088Z 2022-11-23T01:46:56.1368433Z STAGE:2022-11-23 01:44:10 17786:17786 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1368585Z ok (6.582s) 2022-11-23T01:46:56.1368605Z 2022-11-23T01:46:56.1368860Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1368972Z Ran 1 test in 6.582s 2022-11-23T01:46:56.1368991Z 2022-11-23T01:46:56.1369082Z OK 2022-11-23T01:46:56.1369101Z 2022-11-23T01:46:56.1369226Z Generating XML reports... 
2022-11-23T01:46:56.1369680Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014404.xml 2022-11-23T01:46:56.1370056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1370229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1370609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1370784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1370821Z 2022-11-23T01:46:56.1370915Z Running tests... 2022-11-23T01:46:56.1371176Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1371487Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1371754Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T01:46:56.1371775Z 2022-11-23T01:46:56.1372034Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1372144Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1372163Z 2022-11-23T01:46:56.1372271Z OK (skipped=1) 2022-11-23T01:46:56.1372290Z 2022-11-23T01:46:56.1372410Z Generating XML reports... 
2022-11-23T01:46:56.1372843Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014413.xml 2022-11-23T01:46:56.1373217Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1373391Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1373772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1373966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1373986Z 2022-11-23T01:46:56.1374093Z Running tests... 2022-11-23T01:46:56.1374354Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1374667Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1374921Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T01:46:56.1374954Z 2022-11-23T01:46:56.1375199Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1375306Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1375326Z 2022-11-23T01:46:56.1375433Z OK (skipped=1) 2022-11-23T01:46:56.1375452Z 2022-11-23T01:46:56.1375573Z Generating XML reports... 
2022-11-23T01:46:56.1376022Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014416.xml 2022-11-23T01:46:56.1376492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1376673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1377058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1377234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1377268Z 2022-11-23T01:46:56.1377358Z Running tests... 2022-11-23T01:46:56.1377615Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1377928Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1378231Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1378450Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 17966 2022-11-23T01:46:56.1378673Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 17967 2022-11-23T01:46:56.1379046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1379222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1379586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1379779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1380152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1380331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1380712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1380901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1381147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1381395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1381783Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1382184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1382417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1382757Z STAGE:2022-11-23 01:44:22 17967:17967 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1382990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1383322Z STAGE:2022-11-23 01:44:22 17966:17966 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1383600Z [1669167862.405061] [08317a7e7676:17967:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1383828Z [1669167864.056939] [08317a7e7676:17967:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1384065Z [1669167864.056939] [08317a7e7676:17967:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1384337Z [1669167862.398343] [08317a7e7676:17966:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1384552Z [1669167864.083035] [08317a7e7676:17966:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1384834Z [1669167864.083035] [08317a7e7676:17966:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1385396Z STAGE:2022-11-23 01:44:24 17967:17967 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:44:24 17966:17966 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1385417Z 2022-11-23T01:46:56.1385768Z STAGE:2022-11-23 01:44:24 17966:17966 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1386117Z STAGE:2022-11-23 01:44:24 17967:17967 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1386446Z STAGE:2022-11-23 01:44:24 17967:17967 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1386820Z STAGE:2022-11-23 01:44:24 17966:17966 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1387157Z STAGE:2022-11-23 01:44:24 17967:17967 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1387489Z STAGE:2022-11-23 01:44:24 17966:17966 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1387836Z STAGE:2022-11-23 01:44:24 17967:17967 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1388166Z STAGE:2022-11-23 01:44:24 17966:17966 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1388268Z ok (6.642s) 2022-11-23T01:46:56.1388287Z 2022-11-23T01:46:56.1388553Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1388663Z Ran 1 test in 6.642s 2022-11-23T01:46:56.1388684Z 2022-11-23T01:46:56.1388775Z OK 2022-11-23T01:46:56.1388794Z 2022-11-23T01:46:56.1388918Z Generating XML reports... 
2022-11-23T01:46:56.1389529Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014418.xml 2022-11-23T01:46:56.1389906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1390086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1390456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1390648Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1390668Z 2022-11-23T01:46:56.1390775Z Running tests... 2022-11-23T01:46:56.1391037Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1391348Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1391611Z test_scatter (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.1391634Z 2022-11-23T01:46:56.1391891Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1392004Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1392023Z 2022-11-23T01:46:56.1392112Z OK (skipped=1) 2022-11-23T01:46:56.1392151Z 2022-11-23T01:46:56.1392258Z Generating XML reports... 
2022-11-23T01:46:56.1392708Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014427.xml 2022-11-23T01:46:56.1393087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1393261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1393639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1393830Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1393853Z 2022-11-23T01:46:56.1393962Z Running tests... 2022-11-23T01:46:56.1394221Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1394518Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1394864Z test_scatter_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.1394886Z 2022-11-23T01:46:56.1395156Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1395268Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1395287Z 2022-11-23T01:46:56.1395392Z OK (skipped=1) 2022-11-23T01:46:56.1395412Z 2022-11-23T01:46:56.1395534Z Generating XML reports... 
2022-11-23T01:46:56.1395983Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014430.xml 2022-11-23T01:46:56.1396358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1396597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1396967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1397161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1397181Z 2022-11-23T01:46:56.1397290Z Running tests... 2022-11-23T01:46:56.1397547Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1397858Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1398128Z test_scatter_complex (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.1398149Z 2022-11-23T01:46:56.1398408Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1398516Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1398539Z 2022-11-23T01:46:56.1398628Z OK (skipped=1) 2022-11-23T01:46:56.1398665Z 2022-11-23T01:46:56.1398770Z Generating XML reports... 
2022-11-23T01:46:56.1399221Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014432.xml 2022-11-23T01:46:56.1399599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1399773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1400156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1400349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1400368Z 2022-11-23T01:46:56.1400478Z Running tests... 2022-11-23T01:46:56.1400735Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1401037Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1401292Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T01:46:56.1401311Z 2022-11-23T01:46:56.1401572Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1401683Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1401702Z 2022-11-23T01:46:56.1401807Z OK (skipped=1) 2022-11-23T01:46:56.1401826Z 2022-11-23T01:46:56.1401948Z Generating XML reports... 
2022-11-23T01:46:56.1402396Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014434.xml 2022-11-23T01:46:56.1402769Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1402941Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1403304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1403502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1403521Z 2022-11-23T01:46:56.1403628Z Running tests... 2022-11-23T01:46:56.1403933Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1404253Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1404521Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T01:46:56.1404541Z 2022-11-23T01:46:56.1404801Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1404909Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1404929Z 2022-11-23T01:46:56.1405034Z OK (skipped=1) 2022-11-23T01:46:56.1405054Z 2022-11-23T01:46:56.1405159Z Generating XML reports... 
2022-11-23T01:46:56.1405611Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014437.xml 2022-11-23T01:46:56.1406034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1406211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1406596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1406792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1406812Z 2022-11-23T01:46:56.1406917Z Running tests... 2022-11-23T01:46:56.1407179Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1407477Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1407801Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.1407825Z 2022-11-23T01:46:56.1408089Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1408199Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1408219Z 2022-11-23T01:46:56.1408324Z OK (skipped=1) 2022-11-23T01:46:56.1408344Z 2022-11-23T01:46:56.1408470Z Generating XML reports... 
2022-11-23T01:46:56.1408920Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014439.xml 2022-11-23T01:46:56.1409297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1409470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1409835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1410027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1410047Z 2022-11-23T01:46:56.1410158Z Running tests... 2022-11-23T01:46:56.1410417Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1410730Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1411002Z test_scatter_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T01:46:56.1411023Z 2022-11-23T01:46:56.1411317Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1411430Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1411450Z 2022-11-23T01:46:56.1411557Z OK (skipped=1) 2022-11-23T01:46:56.1411576Z 2022-11-23T01:46:56.1411680Z Generating XML reports... 
2022-11-23T01:46:56.1412129Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014442.xml 2022-11-23T01:46:56.1412502Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1412681Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1413059Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1413312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1413334Z 2022-11-23T01:46:56.1413447Z Running tests... 2022-11-23T01:46:56.1413709Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1414005Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1414397Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T01:46:56.1414416Z 2022-11-23T01:46:56.1414671Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1414777Z Ran 1 test in 0.003s 2022-11-23T01:46:56.1414797Z 2022-11-23T01:46:56.1414951Z OK (skipped=1) 2022-11-23T01:46:56.1414971Z 2022-11-23T01:46:56.1415093Z Generating XML reports... 
2022-11-23T01:46:56.1415541Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014444.xml 2022-11-23T01:46:56.1415917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1416093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1416459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1416652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1416671Z 2022-11-23T01:46:56.1416776Z Running tests... 2022-11-23T01:46:56.1417035Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1417344Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1417594Z test_send_recv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1417818Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18344 2022-11-23T01:46:56.1418043Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18345 2022-11-23T01:46:56.1418419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1418578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1418962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1419151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1419522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1419698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1420078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1420268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1420517Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1420748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1421151Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1421550Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1421780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1422019Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1422291Z [1669167890.919145] [08317a7e7676:18344:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1422571Z [1669167892.333705] [08317a7e7676:18344:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1422817Z [1669167892.333705] [08317a7e7676:18344:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1423089Z [1669167890.922023] [08317a7e7676:18345:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1423321Z [1669167892.335098] [08317a7e7676:18345:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1423539Z [1669167892.335098] [08317a7e7676:18345:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1423689Z ok (6.111s) 2022-11-23T01:46:56.1423709Z 2022-11-23T01:46:56.1423975Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1424087Z Ran 1 test in 6.111s 2022-11-23T01:46:56.1424110Z 2022-11-23T01:46:56.1424205Z OK 2022-11-23T01:46:56.1424223Z 2022-11-23T01:46:56.1424345Z Generating XML reports... 
2022-11-23T01:46:56.1424793Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014447.xml 2022-11-23T01:46:56.1425169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1425344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1425712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1425908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1425928Z 2022-11-23T01:46:56.1426035Z Running tests... 2022-11-23T01:46:56.1426293Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1426612Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1426900Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T01:46:56.1426920Z 2022-11-23T01:46:56.1427181Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1427290Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1427309Z 2022-11-23T01:46:56.1427399Z OK (skipped=1) 2022-11-23T01:46:56.1427435Z 2022-11-23T01:46:56.1427540Z Generating XML reports... 
2022-11-23T01:46:56.1427985Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014455.xml 2022-11-23T01:46:56.1428365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1428541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1428925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1429348Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1429368Z 2022-11-23T01:46:56.1429475Z Running tests... 2022-11-23T01:46:56.1429741Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1430035Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1430347Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T01:46:56.1430367Z 2022-11-23T01:46:56.1430632Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1430741Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1430761Z 2022-11-23T01:46:56.1430865Z OK (skipped=1) 2022-11-23T01:46:56.1430884Z 2022-11-23T01:46:56.1431005Z Generating XML reports... 
2022-11-23T01:46:56.1431525Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014458.xml 2022-11-23T01:46:56.1431911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1432084Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1432453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1432646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1432665Z 2022-11-23T01:46:56.1432772Z Running tests... 2022-11-23T01:46:56.1433088Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1433398Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1433706Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T01:46:56.1433728Z 2022-11-23T01:46:56.1433987Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1434096Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1434115Z 2022-11-23T01:46:56.1434220Z OK (skipped=1) 2022-11-23T01:46:56.1434240Z 2022-11-23T01:46:56.1434346Z Generating XML reports... 
2022-11-23T01:46:56.1434793Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014500.xml 2022-11-23T01:46:56.1435167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1435345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1435728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1435918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1435938Z 2022-11-23T01:46:56.1436042Z Running tests... 2022-11-23T01:46:56.1436305Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1436603Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1436879Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1437097Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18553 2022-11-23T01:46:56.1437314Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18554 2022-11-23T01:46:56.1437689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1437866Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1438249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1438439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1438811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1438969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1439346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1439535Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1439778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1440029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1440479Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1440888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1441118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1441459Z STAGE:2022-11-23 01:45:06 18553:18553 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1441677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1442010Z STAGE:2022-11-23 01:45:06 18554:18554 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1442281Z [1669167906.928757] [08317a7e7676:18554:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1442564Z [1669167908.543081] [08317a7e7676:18554:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1442801Z [1669167908.543081] [08317a7e7676:18554:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1443150Z STAGE:2022-11-23 01:45:08 18554:18554 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1443426Z [1669167906.908244] [08317a7e7676:18553:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1443657Z [1669167908.535751] [08317a7e7676:18553:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1443895Z [1669167908.535751] [08317a7e7676:18553:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1444242Z STAGE:2022-11-23 01:45:08 18553:18553 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1444578Z STAGE:2022-11-23 01:45:08 18554:18554 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1444927Z STAGE:2022-11-23 01:45:08 18553:18553 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1445028Z ok (6.681s) 2022-11-23T01:46:56.1445048Z 2022-11-23T01:46:56.1445310Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1445422Z Ran 1 test in 6.681s 2022-11-23T01:46:56.1445442Z 2022-11-23T01:46:56.1445530Z OK 2022-11-23T01:46:56.1445549Z 2022-11-23T01:46:56.1445673Z Generating XML reports... 2022-11-23T01:46:56.1446121Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014502.xml 2022-11-23T01:46:56.1446487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1446664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1447045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1447234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1447254Z 2022-11-23T01:46:56.1447359Z Running tests... 
2022-11-23T01:46:56.1447624Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1447938Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1448175Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T01:46:56.1448195Z 2022-11-23T01:46:56.1448456Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1448554Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1448574Z 2022-11-23T01:46:56.1448683Z OK (skipped=1) 2022-11-23T01:46:56.1448702Z 2022-11-23T01:46:56.1448824Z Generating XML reports... 2022-11-23T01:46:56.1449321Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014512.xml 2022-11-23T01:46:56.1449700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1449874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1450259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1450450Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1450470Z 2022-11-23T01:46:56.1450573Z Running tests... 2022-11-23T01:46:56.1450817Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1451175Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1451448Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
skip: NCCL Send Recv Only (0.002s) 2022-11-23T01:46:56.1451469Z 2022-11-23T01:46:56.1451735Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1451848Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1451867Z 2022-11-23T01:46:56.1451974Z OK (skipped=1) 2022-11-23T01:46:56.1451993Z 2022-11-23T01:46:56.1452116Z Generating XML reports... 2022-11-23T01:46:56.1452564Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014514.xml 2022-11-23T01:46:56.1452922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1453095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1453488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1453678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1453697Z 2022-11-23T01:46:56.1453803Z Running tests... 2022-11-23T01:46:56.1454067Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1454383Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1454647Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T01:46:56.1454667Z 2022-11-23T01:46:56.1454927Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1455021Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1455040Z 2022-11-23T01:46:56.1464057Z OK (skipped=1) 2022-11-23T01:46:56.1464087Z 2022-11-23T01:46:56.1464255Z Generating XML reports... 
2022-11-23T01:46:56.1464747Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014516.xml 2022-11-23T01:46:56.1465132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1465317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1465709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1465900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1465920Z 2022-11-23T01:46:56.1466028Z Running tests... 2022-11-23T01:46:56.1466295Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1466612Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1466868Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1467098Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18766 2022-11-23T01:46:56.1467314Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18767 2022-11-23T01:46:56.1467776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1467957Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1468345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1468533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1468897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1469320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1469702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1469993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1470246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1470494Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1470898Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1471301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1471535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1471873Z STAGE:2022-11-23 01:45:23 18767:18767 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1472108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1472426Z STAGE:2022-11-23 01:45:23 18766:18766 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1472706Z [1669167923.334631] [08317a7e7676:18767:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1472935Z [1669167924.955058] [08317a7e7676:18767:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1473171Z [1669167924.955058] [08317a7e7676:18767:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1473511Z STAGE:2022-11-23 01:45:25 18767:18767 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1473781Z [1669167923.313168] [08317a7e7676:18766:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1474006Z [1669167924.996824] [08317a7e7676:18766:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1474238Z [1669167924.996824] [08317a7e7676:18766:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1474577Z STAGE:2022-11-23 01:45:25 18766:18766 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1474927Z STAGE:2022-11-23 01:45:25 18767:18767 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1475261Z STAGE:2022-11-23 01:45:25 18766:18766 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1475364Z ok (6.786s) 2022-11-23T01:46:56.1475384Z 2022-11-23T01:46:56.1475643Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1475751Z Ran 1 test in 6.786s 2022-11-23T01:46:56.1475774Z 2022-11-23T01:46:56.1475857Z OK 2022-11-23T01:46:56.1475876Z 2022-11-23T01:46:56.1475997Z Generating XML reports... 2022-11-23T01:46:56.1476448Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014519.xml 2022-11-23T01:46:56.1476890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1477060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1477448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1477642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1477662Z 2022-11-23T01:46:56.1477770Z Running tests... 
2022-11-23T01:46:56.1478032Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1478346Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1478658Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1478877Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18880 2022-11-23T01:46:56.1479086Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18881 2022-11-23T01:46:56.1479462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1479638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1480020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1480212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1480582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1480761Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1481140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1481337Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1481567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1481811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1482212Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1482613Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1482844Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1483074Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1483344Z [1669167932.629173] [08317a7e7676:18881:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1483578Z [1669167934.025688] [08317a7e7676:18881:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1483818Z [1669167934.025688] [08317a7e7676:18881:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1484088Z [1669167932.605877] [08317a7e7676:18880:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1484302Z [1669167934.008387] [08317a7e7676:18880:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1484536Z [1669167934.008387] [08317a7e7676:18880:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1484641Z ok (6.076s) 2022-11-23T01:46:56.1484662Z 2022-11-23T01:46:56.1484932Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1485041Z Ran 1 test in 6.077s 2022-11-23T01:46:56.1485124Z 2022-11-23T01:46:56.1485218Z OK 2022-11-23T01:46:56.1485237Z 2022-11-23T01:46:56.1485360Z Generating XML reports... 
2022-11-23T01:46:56.1485815Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014528.xml 2022-11-23T01:46:56.1486191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1486352Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1486736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1486979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1486999Z 2022-11-23T01:46:56.1487105Z Running tests... 2022-11-23T01:46:56.1487370Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1487690Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1487976Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1488197Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 18990 2022-11-23T01:46:56.1488400Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 18991 2022-11-23T01:46:56.1488778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1488952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1489338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1489521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1489891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1490060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1490442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1490630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1490860Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1491110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1491513Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1491918Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1492153Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1492495Z STAGE:2022-11-23 01:45:41 18991:18991 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1492727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1493060Z STAGE:2022-11-23 01:45:41 18990:18990 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1493331Z [1669167941.293059] [08317a7e7676:18990:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1493546Z [1669167942.964946] [08317a7e7676:18990:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1493784Z [1669167942.964946] [08317a7e7676:18990:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1494176Z STAGE:2022-11-23 01:45:43 18990:18990 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1494451Z [1669167941.314571] [08317a7e7676:18991:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1494678Z [1669167942.928474] [08317a7e7676:18991:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1494914Z [1669167942.928474] [08317a7e7676:18991:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1495262Z STAGE:2022-11-23 01:45:43 18991:18991 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1495666Z STAGE:2022-11-23 01:45:43 18990:18990 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1496015Z STAGE:2022-11-23 01:45:43 18991:18991 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1496099Z ok (6.754s) 2022-11-23T01:46:56.1496135Z 2022-11-23T01:46:56.1496388Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1496499Z Ran 1 test in 6.754s 2022-11-23T01:46:56.1496519Z 2022-11-23T01:46:56.1496607Z OK 2022-11-23T01:46:56.1496626Z 2022-11-23T01:46:56.1496745Z Generating XML reports... 2022-11-23T01:46:56.1497192Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014537.xml 2022-11-23T01:46:56.1497566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1497741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1498131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1498309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1498329Z 2022-11-23T01:46:56.1498436Z Running tests... 
2022-11-23T01:46:56.1498704Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1499017Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1499302Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1499524Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19104 2022-11-23T01:46:56.1499744Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19105 2022-11-23T01:46:56.1500119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1500284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1500667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1500861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1501235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1501411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1501791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1501978Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1502224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1502470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1502862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1503306Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1503543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1503886Z STAGE:2022-11-23 01:45:50 19105:19105 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1504112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1504438Z STAGE:2022-11-23 01:45:50 19104:19104 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T01:46:56.1504717Z [1669167950.589205] [08317a7e7676:19104:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1504993Z [1669167952.242946] [08317a7e7676:19104:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1505234Z [1669167952.242946] [08317a7e7676:19104:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1505561Z STAGE:2022-11-23 01:45:52 19104:19104 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1505828Z [1669167950.589164] [08317a7e7676:19105:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1506058Z [1669167952.237822] [08317a7e7676:19105:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1506292Z [1669167952.237822] [08317a7e7676:19105:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1506637Z STAGE:2022-11-23 01:45:52 19105:19105 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T01:46:56.1506990Z STAGE:2022-11-23 01:45:52 19104:19104 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1507340Z STAGE:2022-11-23 01:45:52 19105:19105 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T01:46:56.1507440Z ok (6.755s) 2022-11-23T01:46:56.1507460Z 2022-11-23T01:46:56.1507727Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1507840Z Ran 1 test in 6.756s 2022-11-23T01:46:56.1507860Z 2022-11-23T01:46:56.1507933Z OK 2022-11-23T01:46:56.1507952Z 2022-11-23T01:46:56.1508075Z Generating XML reports... 2022-11-23T01:46:56.1508529Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014546.xml 2022-11-23T01:46:56.1508907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1509295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1509686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1509879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1509900Z 2022-11-23T01:46:56.1510006Z Running tests... 
2022-11-23T01:46:56.1510255Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1510564Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1510848Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T01:46:56.1510868Z 2022-11-23T01:46:56.1511122Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1511236Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1511292Z 2022-11-23T01:46:56.1511398Z OK (skipped=1) 2022-11-23T01:46:56.1511417Z 2022-11-23T01:46:56.1511541Z Generating XML reports... 2022-11-23T01:46:56.1511991Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014555.xml 2022-11-23T01:46:56.1512432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1512600Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1512981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1513169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1513189Z 2022-11-23T01:46:56.1513297Z Running tests... 2022-11-23T01:46:56.1513556Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1513869Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1514227Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T01:46:56.1514247Z 2022-11-23T01:46:56.1514514Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1514626Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1514645Z 2022-11-23T01:46:56.1514736Z OK (skipped=1) 2022-11-23T01:46:56.1514755Z 2022-11-23T01:46:56.1514875Z Generating XML reports... 2022-11-23T01:46:56.1515323Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014558.xml 2022-11-23T01:46:56.1515697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1515872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1516254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1516451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1516470Z 2022-11-23T01:46:56.1516576Z Running tests... 2022-11-23T01:46:56.1516821Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1517136Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1517402Z test_stateless_api_with_ddp (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1517625Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19284 2022-11-23T01:46:56.1517840Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19285 2022-11-23T01:46:56.1518215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1518388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1518776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1518967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1519322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1519490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1519863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1520049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1520293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1520539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1520946Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1521388Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1521625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1521839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1522108Z [1669167965.998306] [08317a7e7676:19285:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1522339Z [1669167966.011711] [08317a7e7676:19285:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1522576Z [1669167966.011711] [08317a7e7676:19285:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1522943Z [1669167965.997060] [08317a7e7676:19284:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1523171Z [1669167966.010943] [08317a7e7676:19284:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1523404Z [1669167966.010943] [08317a7e7676:19284:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1523504Z ok (6.623s) 2022-11-23T01:46:56.1523524Z 2022-11-23T01:46:56.1523792Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1523887Z Ran 1 test in 6.623s 2022-11-23T01:46:56.1523925Z 2022-11-23T01:46:56.1523998Z OK 2022-11-23T01:46:56.1524018Z 2022-11-23T01:46:56.1524141Z Generating XML reports... 
2022-11-23T01:46:56.1524589Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014600.xml 2022-11-23T01:46:56.1524969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1525143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1525527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1525719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1525739Z 2022-11-23T01:46:56.1525844Z Running tests... 2022-11-23T01:46:56.1526090Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1526401Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1526666Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1526887Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19402 2022-11-23T01:46:56.1527111Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19403 2022-11-23T01:46:56.1527486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1527666Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1528053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1528227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1528599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1528771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1529157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1529350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1529597Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1529887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1530294Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1530697Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1530913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1531173Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbcartvr_ 2022-11-23T01:46:56.1531441Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbcartvr_/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1531713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1531968Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmkop_rq2 2022-11-23T01:46:56.1532240Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmkop_rq2/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1532514Z [1669167973.816501] [08317a7e7676:19403:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1532743Z [1669167975.198824] [08317a7e7676:19403:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1532977Z [1669167975.198824] [08317a7e7676:19403:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1533243Z [1669167973.793859] [08317a7e7676:19402:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1533458Z [1669167975.238199] [08317a7e7676:19402:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1533698Z [1669167975.238199] [08317a7e7676:19402:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1533801Z ok (6.226s) 2022-11-23T01:46:56.1533821Z 2022-11-23T01:46:56.1534090Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1534199Z Ran 1 test in 6.226s 2022-11-23T01:46:56.1534219Z 2022-11-23T01:46:56.1534309Z OK 2022-11-23T01:46:56.1534329Z 2022-11-23T01:46:56.1534446Z Generating XML reports... 2022-11-23T01:46:56.1534894Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014609.xml 2022-11-23T01:46:56.1535256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1535433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1535817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1536013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1536033Z 2022-11-23T01:46:56.1536142Z Running tests... 2022-11-23T01:46:56.1536408Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1536723Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1537028Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl & Gloo backend support DistributedDataParallel (0.002s) 2022-11-23T01:46:56.1537048Z 2022-11-23T01:46:56.1537309Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1537403Z Ran 1 test in 0.002s 2022-11-23T01:46:56.1537442Z 2022-11-23T01:46:56.1537532Z OK (skipped=1) 2022-11-23T01:46:56.1537551Z 2022-11-23T01:46:56.1537673Z Generating XML reports... 2022-11-23T01:46:56.1538122Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014618.xml 2022-11-23T01:46:56.1538544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1538724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1539109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1539303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1539323Z 2022-11-23T01:46:56.1539430Z Running tests... 2022-11-23T01:46:56.1539675Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1540035Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1540329Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1540553Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19549 2022-11-23T01:46:56.1540772Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19550 2022-11-23T01:46:56.1541146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1541323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1541703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1541877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1542246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1542425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1542802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1542994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1543238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1543483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1543889Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1544291Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1544508Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1544743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1544999Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3fbg6qby 2022-11-23T01:46:56.1545274Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3fbg6qby/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1545532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn_159zpe 2022-11-23T01:46:56.1545801Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn_159zpe/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1546078Z [1669167986.301055] [08317a7e7676:19549:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1546308Z [1669167986.314742] [08317a7e7676:19549:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1546552Z [1669167986.314742] [08317a7e7676:19549:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1547409Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:46:56.1547695Z [1669167986.305247] [08317a7e7676:19550:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1547925Z [1669167986.318376] [08317a7e7676:19550:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1548196Z [1669167986.318376] [08317a7e7676:19550:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1549200Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T01:46:56.1549312Z ok (6.651s) 2022-11-23T01:46:56.1549334Z 2022-11-23T01:46:56.1549611Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1549724Z Ran 1 test in 6.651s 2022-11-23T01:46:56.1549748Z 2022-11-23T01:46:56.1549840Z OK 2022-11-23T01:46:56.1549859Z 2022-11-23T01:46:56.1549982Z Generating XML reports... 2022-11-23T01:46:56.1550437Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014621.xml 2022-11-23T01:46:56.1550818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1550994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1551366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1551554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1551574Z 2022-11-23T01:46:56.1551682Z Running tests... 
2022-11-23T01:46:56.1551944Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1552255Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1552545Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1552767Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19667 2022-11-23T01:46:56.1552988Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19668 2022-11-23T01:46:56.1553363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1553523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1553908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1554097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1554466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1554643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1555022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1555284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1555538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1555770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1556178Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1556576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1556809Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1557097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1557339Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1557585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1557983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1558375Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1558600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:46:56.1558843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:46:56.1559232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.1559630Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:46:56.1559894Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpciu60yov 2022-11-23T01:46:56.1560168Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpciu60yov/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1560422Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3nudwegz 2022-11-23T01:46:56.1560691Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3nudwegz/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1560966Z [1669167995.554723] [08317a7e7676:19668:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1561197Z [1669167995.568221] [08317a7e7676:19668:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1561423Z [1669167995.568221] [08317a7e7676:19668:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1561811Z [1669168000.930087] [08317a7e7676:19668:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x55d8598a95c0 was not matched 2022-11-23T01:46:56.1562083Z [1669167995.547185] [08317a7e7676:19667:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1562314Z [1669167995.560685] [08317a7e7676:19667:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1562548Z [1669167995.560685] [08317a7e7676:19667:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1562864Z [1669168000.898684] [08317a7e7676:19667:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. 
has expired on req 0x55b429be6340, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T01:46:56.1563189Z [1669168000.930021] [08317a7e7676:19667:0] mpool.c:55 UCX WARN object 0x55b429cf7840 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T01:46:56.1563296Z ok (11.225s) 2022-11-23T01:46:56.1563316Z 2022-11-23T01:46:56.1563583Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1563696Z Ran 1 test in 11.226s 2022-11-23T01:46:56.1563715Z 2022-11-23T01:46:56.1563789Z OK 2022-11-23T01:46:56.1563807Z 2022-11-23T01:46:56.1563932Z Generating XML reports... 2022-11-23T01:46:56.1564383Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014630.xml 2022-11-23T01:46:56.1564759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1564977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1565360Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1565556Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1565577Z 2022-11-23T01:46:56.1565683Z Running tests... 2022-11-23T01:46:56.1565946Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1566247Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T01:46:56.1566533Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T01:46:56.1566755Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19787 2022-11-23T01:46:56.1566976Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19788 2022-11-23T01:46:56.1567357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1567535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1567920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1568111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1568464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T01:46:56.1568635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T01:46:56.1569018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T01:46:56.1569204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T01:46:56.1569451Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T01:46:56.1569704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T01:46:56.1570110Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T01:46:56.1570510Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T01:46:56.1570742Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T01:46:56.1570958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T01:46:56.1571200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T01:46:56.1571440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T01:46:56.1571838Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1572229Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T01:46:56.1572516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T01:46:56.1572764Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T01:46:56.1573157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T01:46:56.1573546Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T01:46:56.1573791Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm8zahlbp 2022-11-23T01:46:56.1574068Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm8zahlbp/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1574367Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7onetvuy 2022-11-23T01:46:56.1574635Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7onetvuy/_remote_module_non_scriptable.py 2022-11-23T01:46:56.1574914Z [1669168009.330401] [08317a7e7676:19787:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T01:46:56.1575149Z [1669168009.344124] [08317a7e7676:19787:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1575386Z [1669168009.344124] [08317a7e7676:19787:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1575698Z [1669168014.676110] [08317a7e7676:19787:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. has expired on req 0x55834659c880, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T01:46:56.1575982Z [1669168014.707253] [08317a7e7676:19787:0] mpool.c:55 UCX WARN object 0x5583466adc80 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T01:46:56.1576254Z [1669168009.332912] [08317a7e7676:19788:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T01:46:56.1576481Z [1669168009.346699] [08317a7e7676:19788:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T01:46:56.1576699Z [1669168009.346699] [08317a7e7676:19788:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T01:46:56.1577088Z [1669168014.717226] [08317a7e7676:19788:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x55db7808a400 was not matched 2022-11-23T01:46:56.1577193Z ok (11.246s) 2022-11-23T01:46:56.1577213Z 2022-11-23T01:46:56.1577485Z ---------------------------------------------------------------------- 2022-11-23T01:46:56.1577599Z Ran 1 test in 11.246s 2022-11-23T01:46:56.1577619Z 2022-11-23T01:46:56.1577708Z OK 2022-11-23T01:46:56.1577727Z 2022-11-23T01:46:56.1577849Z Generating XML reports... 2022-11-23T01:46:56.1578300Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014643.xml 2022-11-23T01:46:56.1578320Z 2022-11-23T01:46:56.1578744Z ##[endgroup] 2022-11-23T01:46:56.1579214Z FINISHED PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_ceaarsv3) 2022-11-23T01:46:56.1579235Z 2022-11-23T01:46:56.1579448Z Running distributed tests for the ucc backend with file init_method in shard 3 of 3 2022-11-23T01:46:56.1579961Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 01:46:55.923929] 2022-11-23T02:12:26.5573704Z 2022-11-23T02:12:26.5574205Z Expand the folded group to see the log file of distributed/test_distributed_spawn 2022-11-23T02:12:26.5577836Z ##[group]PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_3vkyzowr) 2022-11-23T02:12:26.5578596Z 2022-11-23T02:12:26.5640936Z , <__main__.TestDistBackendWithSpawn testMethod=test_3_level_hierarchical_model_averager>, <__main__.TestDistBackendWithSpawn testMethod=test_Backend_enum_class>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallelCPU_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_2D_Input>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Channels_Last>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_No_Affine>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_non_default_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedDataParallel_with_amp_and_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_DistributedSampler_padding>, <__main__.TestDistBackendWithSpawn 
testMethod=test_SyncBatchNorm_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_allreduce_with_then_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_accumulate_gradients_no_sync_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_simple>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_coalesced_with_empty>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_cat_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_into_stack_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_multigpu_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_default_pg>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_all_gather_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_max>, <__main__.TestDistBackendWithSpawn 
testMethod=test_all_reduce_coalesced_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_max_complex_unsupported>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_coalesced_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_complex_unsupported_ops>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_multigpu_complex>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_result_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_async>, <__main__.TestDistBackendWithSpawn testMethod=test_all_reduce_sum_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_equal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split>, 
<__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group>, <__main__.TestDistBackendWithSpawn testMethod=test_all_to_all_single_unequal_split_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_average_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_backend_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_full_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_group_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_global>, <__main__.TestDistBackendWithSpawn testMethod=test_barrier_timeout_group>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_gloo_tags>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_mixed_backend_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_no_rank_zero_nccl>, <__main__.TestDistBackendWithSpawn 
testMethod=test_batch_isend_irecv_op_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_op_list_err>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_ring_exchange_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_self_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_batch_isend_irecv_tensor_err>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_group>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_broadcast_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_with_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_compute_bucket_assignment_by_size_sparse_error_without_logger>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_broadcast_buffer_via_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_buffer_hook_allreduce_return_future>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_build_debug_param_to_name_mapping_requires_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_comm_hook_logging>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_different_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_control_flow_same_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_create_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_device>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_forward_backward_hook>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_grad_div_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_allreduce_process_group>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_post_localSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_parity_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_pickling_powerSGD>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False>, <__main__.TestDistBackendWithSpawn 
testMethod=test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_ignore_params_arg>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_inference>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_join_model_equivalence>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_logging_data_gpu>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_num_params_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_model_diff_shape_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_err_ignore_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_multiple_nested_unused_params_error>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_namedtuple>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_new_tensor_in_fwd_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_profiling_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_python_error_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_returns_tensor_with_no_grad>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_shared_grad_acc_unused_params>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_static_graph_nested_types>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_bn_training_vs_eval>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_sync_module_states>, 
<__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_input_join_disable>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_uneven_inputs_stop_iteration_sync_bn>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_unused_params_rebuild_buckets_exception>, <__main__.TestDistBackendWithSpawn testMethod=test_ddp_zero_output_features>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_destroy_group>, <__main__.TestDistBackendWithSpawn testMethod=test_detect_ddp_is_actually_static>, <__main__.TestDistBackendWithSpawn testMethod=test_different_graph_across_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_dump_DDP_relevant_env_vars>, <__main__.TestDistBackendWithSpawn testMethod=test_gather>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_checks>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_group>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object>, <__main__.TestDistBackendWithSpawn testMethod=test_gather_object_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_get_backend>, <__main__.TestDistBackendWithSpawn testMethod=test_get_future>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_get_rank_size_group>, <__main__.TestDistBackendWithSpawn testMethod=test_invalid_static_graph>, <__main__.TestDistBackendWithSpawn testMethod=test_irecv>, <__main__.TestDistBackendWithSpawn testMethod=test_isend>, <__main__.TestDistBackendWithSpawn testMethod=test_isend_autograd_profiler>, 
<__main__.TestDistBackendWithSpawn testMethod=test_isend_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_allreduce_hang_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_failure_order>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_rank_0_timeout>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_gloo_subgroup>, <__main__.TestDistBackendWithSpawn testMethod=test_monitored_barrier_wait_all_ranks>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allgather>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_allreduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_broadcast>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_backend_bool_reduce>, <__main__.TestDistBackendWithSpawn testMethod=test_nccl_high_priority_stream>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_input_rank_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_by_enumeration_negative_input_rank>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_group_size_exceeds_world_size>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_overlap_not_allowed>, <__main__.TestDistBackendWithSpawn testMethod=test_new_subgroups_world_size_not_divisible_by_group_size>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_dict_module>, <__main__.TestDistBackendWithSpawn testMethod=test_output_unused_in_loss_tuple_module>, <__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager>, 
<__main__.TestDistBackendWithSpawn testMethod=test_periodic_model_averager_param_group>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view>, <__main__.TestDistBackendWithSpawn testMethod=test_post_localSGD_optimizer_step_reload>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_full_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_group_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_max>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_min>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_multigpu>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_product>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_tensor_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_scatter_v_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_cuda_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_reduce_sum_twice>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_checks>, 
<__main__.TestDistBackendWithSpawn testMethod=test_scatter_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_cuda_complex>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_full_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_group>, <__main__.TestDistBackendWithSpawn testMethod=test_scatter_object_list>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_any_source_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_nccl_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_autograd_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_send_recv_with_tag_torch_profiler>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum>, <__main__.TestDistBackendWithSpawn testMethod=test_sparse_all_reduce_sum_cuda>, <__main__.TestDistBackendWithSpawn testMethod=test_stateless_api_with_ddp>, <__main__.TestDistBackendWithSpawn testMethod=test_static_graph_api_cpu>, <__main__.TestDistBackendWithSpawn testMethod=test_sync_bn_logged>, <__main__.TestDistBackendWithSpawn testMethod=test_undefined_grad_parity_unused_parameters>, <__main__.TestDistBackendWithSpawn testMethod=test_verify_model_across_rank_with_logger>, <__main__.TestDistBackendWithSpawn 
testMethod=test_verify_model_across_rank_without_logger>]> 2022-11-23T02:12:26.5691435Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5693614Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5694059Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5694490Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5694931Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5695418Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5695930Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5696440Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5696951Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5697634Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5698229Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5699065Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5699633Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5700166Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5700661Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5701364Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5702071Z test_DistributedSampler_padding 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5702519Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5702962Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5703400Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5703905Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5704392Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5704793Z test_all_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5705204Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5705655Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5706093Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5706507Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5706944Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5707363Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5707736Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5708136Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5708554Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5709467Z test_all_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5709987Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5710430Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5710863Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5711264Z test_all_gather_multigpu_complex 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5711694Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5712126Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5712511Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5712931Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5713383Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5713846Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5714287Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5714731Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5715170Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5715591Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5716035Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5716573Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5717145Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5717678Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5718103Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5718526Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5718936Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5719376Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5719792Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5720308Z 
test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5720713Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5721127Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5721528Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5721915Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5722320Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5722706Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5723070Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5723568Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5724333Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5724759Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5725150Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5725546Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5725934Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5726328Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5726730Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5727137Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5727558Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5727930Z test_all_to_all (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5728314Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5728703Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5729079Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) 
2022-11-23T02:12:26.5729491Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5729903Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5730289Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5730686Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5731102Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5731547Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5731972Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5732430Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5732896Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5733342Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5733808Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5734256Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5734699Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5735210Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5735676Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5736144Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5736609Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5737087Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5737559Z test_all_to_all_single_unequal_split_group 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5738073Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5738490Z test_average_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5738893Z test_backend_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5739292Z test_backend_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5739656Z test_barrier (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5740027Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5740419Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5740822Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5741200Z test_barrier_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5741590Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5742000Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5742399Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5742810Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5743210Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5743609Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5744122Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5744695Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5745125Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5745528Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5745958Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5746396Z test_batch_isend_irecv_ring_exchange_nccl 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5746816Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5747242Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5747641Z test_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5748021Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5748401Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5748801Z test_broadcast_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5749927Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5750321Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5750790Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5751316Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5752092Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5752511Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5752943Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5753395Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5753943Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5754442Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5754895Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5755303Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5755752Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5756173Z 
test_ddp_create_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5756555Z test_ddp_device (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5756940Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5757437Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5757861Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5758296Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5758747Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5759176Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5759599Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5760056Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5760566Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5761147Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5761782Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5762393Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5763017Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5763634Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5764244Z 
test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5764834Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5765443Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5765999Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5766498Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5766932Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5767325Z test_ddp_inference (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5767731Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5768126Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5768524Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5768950Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5769398Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5769849Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5770372Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5770799Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5771180Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5771595Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5772034Z test_ddp_profiling_autograd_profiler 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5772453Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5772874Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5773293Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5773778Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5774196Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5774621Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5775034Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5775428Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5775849Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5776256Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5776682Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5777132Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5777578Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5777983Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5778351Z test_destroy_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5778754Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5779180Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5779583Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5779968Z test_gather (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5780333Z test_gather_checks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5780709Z 
test_gather_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5781073Z test_gather_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5781456Z test_gather_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5781831Z test_gather_object (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5782205Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5782596Z test_get_backend (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5782962Z test_get_future (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5783306Z test_get_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5783690Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5784087Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5784486Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5784841Z test_irecv (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5785191Z test_isend (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5785575Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5785962Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5786380Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5786840Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5787285Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5787701Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5788171Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5788623Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5789098Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) 
2022-11-23T02:12:26.5789710Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5790136Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5790543Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5790954Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5791369Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5791836Z test_new_subgroups (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5792240Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5792718Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5793220Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5793682Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5794136Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5794604Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5795048Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5795486Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5795915Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5796352Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5796780Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5797238Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5797724Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5798228Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5798720Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5799146Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5799544Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5799936Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5800349Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5800742Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5801113Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5801508Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5801903Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5802281Z test_reduce_max (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5802629Z test_reduce_min (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5803007Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5803391Z test_reduce_product (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5803773Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5804183Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5804564Z test_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5804926Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5805312Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5805704Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5806116Z test_scatter (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5806497Z test_scatter_checks 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5806881Z test_scatter_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5807258Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5807631Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5808029Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5808413Z test_scatter_group (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5808784Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5809211Z test_send_recv (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5809594Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5810006Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5810457Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5810892Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5811293Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5811681Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5812115Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5812530Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5812910Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5813325Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5813770Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5814172Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5814581Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5814996Z test_stateless_api_with_ddp 
(__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5815399Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5815771Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5816197Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5816651Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5817083Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.5817833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5818300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5818888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5819358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5819590Z 2022-11-23T02:12:26.5819700Z Running tests... 2022-11-23T02:12:26.5820110Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5820634Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5821233Z test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.5821805Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 19940 2022-11-23T02:12:26.5851168Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 19941 2022-11-23T02:12:26.5851915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5852384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5853106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5853608Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5854185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5854642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5855219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5855675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5856235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.5856748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.5857427Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.5858215Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.5859066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.5859554Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.5860097Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5860937Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:12:26.5861616Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5862445Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:12:26.5863112Z [1669168026.584182] [08317a7e7676:19940:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.5863643Z [1669168026.589910] [08317a7e7676:19941:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.5864145Z [1669168026.598513] [08317a7e7676:19940:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5864627Z [1669168026.598513] [08317a7e7676:19940:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5865101Z [1669168026.602278] [08317a7e7676:19941:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5865549Z [1669168026.602278] [08317a7e7676:19941:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5866077Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5866661Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5867500Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:12:26.5868412Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:12:26.5869458Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5870147Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5870999Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:12:26.5871905Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 
2022-11-23T02:12:26.5872573Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5873403Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:12:26.5874138Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager:Model averaging hierarchy: 2022-11-23T02:12:26.5874953Z INFO:torch.distributed.algorithms.model_averaging.hierarchical_model_averager: Each group that has 2 processes average parameters every 4 iterations, if no higher-level averaging. 2022-11-23T02:12:26.5875501Z ok (7.454s) 2022-11-23T02:12:26.5875657Z 2022-11-23T02:12:26.5875931Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5876263Z Ran 1 test in 7.454s 2022-11-23T02:12:26.5876410Z 2022-11-23T02:12:26.5876504Z OK 2022-11-23T02:12:26.5876638Z 2022-11-23T02:12:26.5876763Z Generating XML reports... 2022-11-23T02:12:26.5877374Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014700.xml 2022-11-23T02:12:26.5878091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5878546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5879135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5879612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5879849Z 2022-11-23T02:12:26.5879940Z Running tests... 
2022-11-23T02:12:26.5880344Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5880874Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5881391Z test_3_level_hierarchical_model_averager (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.004s) 2022-11-23T02:12:26.5881704Z 2022-11-23T02:12:26.5881970Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5882302Z Ran 1 test in 0.004s 2022-11-23T02:12:26.5882465Z 2022-11-23T02:12:26.5882573Z OK (skipped=1) 2022-11-23T02:12:26.5882729Z 2022-11-23T02:12:26.5882835Z Generating XML reports... 2022-11-23T02:12:26.5883445Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014710.xml 2022-11-23T02:12:26.5884172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5884627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5885194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5885667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5885901Z 2022-11-23T02:12:26.5886010Z Running tests... 2022-11-23T02:12:26.5886395Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5886929Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5887445Z test_Backend_enum_class (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.5887999Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20088 2022-11-23T02:12:26.5888451Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20089 2022-11-23T02:12:26.5889111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5889568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5890133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5890611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5891197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5891749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5892318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5892789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5893252Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.5893765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.5894421Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.5895123Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.5895664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.5896130Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.5896482Z ok (4.269s) 2022-11-23T02:12:26.5896630Z 2022-11-23T02:12:26.5896905Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5897237Z Ran 1 test in 4.269s 2022-11-23T02:12:26.5897380Z 2022-11-23T02:12:26.5897473Z OK 2022-11-23T02:12:26.5897606Z 2022-11-23T02:12:26.5897730Z Generating XML reports... 2022-11-23T02:12:26.5898343Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014712.xml 2022-11-23T02:12:26.5899051Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5899515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5900094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5900576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5900791Z 2022-11-23T02:12:26.5900899Z Running tests... 2022-11-23T02:12:26.5901309Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5901842Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5902366Z test_DistributedDataParallel (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.5903442Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77317 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.658s) 2022-11-23T02:12:26.5903985Z 2022-11-23T02:12:26.5904267Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5904595Z Ran 1 test in 1.658s 2022-11-23T02:12:26.5904759Z 2022-11-23T02:12:26.5904866Z OK (skipped=1) 2022-11-23T02:12:26.5905004Z 2022-11-23T02:12:26.5905128Z Generating XML reports... 2022-11-23T02:12:26.5905796Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014719.xml 2022-11-23T02:12:26.5906531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5906989Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5907552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5908025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5908256Z 2022-11-23T02:12:26.5908364Z Running tests... 2022-11-23T02:12:26.5908812Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5909794Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5910351Z test_DistributedDataParallelCPU (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.5910878Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20225 2022-11-23T02:12:26.5911321Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20226 2022-11-23T02:12:26.5911936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5912398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5912964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5913442Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5914029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5914482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5915043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5915518Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5915981Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.5916476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.5917138Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.5917835Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.5918365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.5918854Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_i4gshmk 2022-11-23T02:12:26.5919401Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_i4gshmk/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5919912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.5920416Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbjbbv0bo 2022-11-23T02:12:26.5920941Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbjbbv0bo/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5921468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5921962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5922479Z [1669168047.471821] [08317a7e7676:20225:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.5923081Z [1669168048.934289] [08317a7e7676:20225:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5923572Z [1669168048.934289] [08317a7e7676:20225:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5924090Z [1669168047.493146] [08317a7e7676:20226:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.5924597Z [1669168048.914786] [08317a7e7676:20226:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5925055Z [1669168048.914786] [08317a7e7676:20226:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5925466Z ok (6.223s) 2022-11-23T02:12:26.5925617Z 2022-11-23T02:12:26.5925893Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5926204Z Ran 1 test in 6.223s 2022-11-23T02:12:26.5926367Z 2022-11-23T02:12:26.5926461Z OK 2022-11-23T02:12:26.5926594Z 2022-11-23T02:12:26.5926721Z Generating XML reports... 2022-11-23T02:12:26.5927329Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014723.xml 2022-11-23T02:12:26.5928037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5928498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5929078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5929539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5929779Z 2022-11-23T02:12:26.5929887Z Running tests... 2022-11-23T02:12:26.5930290Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5930823Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5931371Z test_DistributedDataParallelCPU_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.5931913Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20339 2022-11-23T02:12:26.5932371Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20340 2022-11-23T02:12:26.5932970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5933422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5933999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5934476Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5935047Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5935501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5936076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5936546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5936989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.5937499Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.5938163Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.5938846Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.5939377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.5939943Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplpfpivfx 2022-11-23T02:12:26.5940503Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplpfpivfx/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5941007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.5941513Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpol1d9vu3 2022-11-23T02:12:26.5942057Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpol1d9vu3/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5942579Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5943108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5943642Z [1669168056.323401] [08317a7e7676:20340:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.5944157Z [1669168057.714179] [08317a7e7676:20340:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5944629Z [1669168057.714179] [08317a7e7676:20340:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5945127Z [1669168056.302100] [08317a7e7676:20339:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.5945627Z [1669168057.715490] [08317a7e7676:20339:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5946094Z [1669168057.715490] [08317a7e7676:20339:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5946422Z ok (6.253s) 2022-11-23T02:12:26.5946572Z 2022-11-23T02:12:26.5946847Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5947178Z Ran 1 test in 6.254s 2022-11-23T02:12:26.5947344Z 2022-11-23T02:12:26.5947437Z OK 2022-11-23T02:12:26.5947552Z 2022-11-23T02:12:26.5947676Z Generating XML reports... 2022-11-23T02:12:26.5948287Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014732.xml 2022-11-23T02:12:26.5949242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5949696Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5950281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5950762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5950996Z 2022-11-23T02:12:26.5951098Z Running tests... 2022-11-23T02:12:26.5951481Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5952010Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5952570Z test_DistributedDataParallel_SyncBatchNorm (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.5953107Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20453 2022-11-23T02:12:26.5953548Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20454 2022-11-23T02:12:26.5954158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5954612Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5955184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5955658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5956313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5956769Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5957334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5957801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5958262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.5958753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.5959420Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.5960192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.5960726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.5961191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.5961693Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgblu_sxf 2022-11-23T02:12:26.5962241Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgblu_sxf/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5962779Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpee77hu70 2022-11-23T02:12:26.5963304Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpee77hu70/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5963864Z [1669168066.382972] [08317a7e7676:20453:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.5964374Z [1669168066.396543] [08317a7e7676:20453:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5964856Z [1669168066.396543] [08317a7e7676:20453:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5965354Z [1669168066.384633] [08317a7e7676:20454:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.5965853Z [1669168066.397662] [08317a7e7676:20454:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5966324Z [1669168066.397662] [08317a7e7676:20454:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5966804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5967278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.5967768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5968261Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5968726Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5969213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5969693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5970175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5970644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5971129Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5971607Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5972119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5972482Z ok (7.218s) 2022-11-23T02:12:26.5972629Z 2022-11-23T02:12:26.5972907Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5973238Z Ran 1 test in 7.218s 2022-11-23T02:12:26.5973382Z 2022-11-23T02:12:26.5973474Z OK 2022-11-23T02:12:26.5973607Z 2022-11-23T02:12:26.5973730Z Generating XML reports... 
2022-11-23T02:12:26.5974340Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014741.xml 2022-11-23T02:12:26.5975044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5975574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5976158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5976642Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5976875Z 2022-11-23T02:12:26.5976966Z Running tests... 2022-11-23T02:12:26.5977368Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5977894Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.5978443Z test_DistributedDataParallel_SyncBatchNorm_2D_Input (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.5978986Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20571 2022-11-23T02:12:26.5979444Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20572 2022-11-23T02:12:26.5980065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5980505Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5981083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5981560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5982130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5982551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5983106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5983562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5984004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.5984501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.5985161Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.5985846Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.5986353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.5986811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.5987307Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf4ddxwc5 2022-11-23T02:12:26.5987854Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf4ddxwc5/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5988377Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq7h0tiem 2022-11-23T02:12:26.5988971Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq7h0tiem/_remote_module_non_scriptable.py 2022-11-23T02:12:26.5989820Z [1669168076.205945] [08317a7e7676:20571:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.5990329Z [1669168076.219581] [08317a7e7676:20571:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5990808Z [1669168076.219581] [08317a7e7676:20571:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5991323Z [1669168076.208772] [08317a7e7676:20572:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.5991824Z [1669168076.222079] [08317a7e7676:20572:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.5992358Z [1669168076.222079] [08317a7e7676:20572:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.5992828Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5993322Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.5993816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5994288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.5994639Z ok (6.159s) 2022-11-23T02:12:26.5994788Z 2022-11-23T02:12:26.5995069Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.5995400Z Ran 1 test in 6.160s 2022-11-23T02:12:26.5995544Z 2022-11-23T02:12:26.5995639Z OK 2022-11-23T02:12:26.5995773Z 2022-11-23T02:12:26.5995901Z Generating XML reports... 2022-11-23T02:12:26.5996510Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014750.xml 2022-11-23T02:12:26.5997220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.5997676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.5998258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.5998739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.5998957Z 2022-11-23T02:12:26.5999066Z Running tests... 2022-11-23T02:12:26.5999469Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6000000Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6000563Z test_DistributedDataParallel_SyncBatchNorm_Channels_Last (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6001121Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20689 2022-11-23T02:12:26.6001581Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20690 2022-11-23T02:12:26.6002203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6002636Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6003218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6003694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6004263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6004711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6005291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6005765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6006260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6006774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6007436Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6008135Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6008649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6009131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6009691Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpovihxqrj 2022-11-23T02:12:26.6010225Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpovihxqrj/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6010772Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd3r76fo8 2022-11-23T02:12:26.6011321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd3r76fo8/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6011878Z [1669168084.857883] [08317a7e7676:20689:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6012373Z [1669168084.872054] [08317a7e7676:20689:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6012850Z [1669168084.872054] [08317a7e7676:20689:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6013371Z [1669168084.858563] [08317a7e7676:20690:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6013877Z [1669168084.872078] [08317a7e7676:20690:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6014325Z [1669168084.872078] [08317a7e7676:20690:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6014811Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6015304Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.6015794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6016265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6016749Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6017231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6017718Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6018188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6018538Z ok (6.245s) 2022-11-23T02:12:26.6018685Z 2022-11-23T02:12:26.6018957Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6019272Z Ran 1 test in 6.245s 2022-11-23T02:12:26.6019432Z 2022-11-23T02:12:26.6019526Z OK 2022-11-23T02:12:26.6019660Z 2022-11-23T02:12:26.6019784Z Generating XML reports... 2022-11-23T02:12:26.6020383Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014759.xml 2022-11-23T02:12:26.6021117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6021575Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6022208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6022675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6022910Z 2022-11-23T02:12:26.6023017Z Running tests... 
2022-11-23T02:12:26.6023424Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6023954Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6024539Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6025117Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20807 2022-11-23T02:12:26.6025632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20808 2022-11-23T02:12:26.6026232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6026691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6027271Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6027743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6028311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6028765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6029576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6030037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6030499Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6031012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6031682Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6032364Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6032895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6033375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6033881Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmz8w4jpq 2022-11-23T02:12:26.6034419Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmz8w4jpq/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6034959Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsh39mulu 2022-11-23T02:12:26.6035502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsh39mulu/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6036057Z [1669168093.718553] [08317a7e7676:20807:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6036545Z [1669168093.732266] [08317a7e7676:20807:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6037016Z [1669168093.732266] [08317a7e7676:20807:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6037539Z [1669168093.722151] [08317a7e7676:20808:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6038042Z [1669168093.735641] [08317a7e7676:20808:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6038569Z [1669168093.735641] [08317a7e7676:20808:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6039061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.6039556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6039896Z ok (6.366s) 2022-11-23T02:12:26.6040044Z 2022-11-23T02:12:26.6040318Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6040647Z Ran 1 test in 6.366s 2022-11-23T02:12:26.6040811Z 2022-11-23T02:12:26.6040904Z OK 2022-11-23T02:12:26.6041021Z 2022-11-23T02:12:26.6041145Z Generating XML reports... 2022-11-23T02:12:26.6041756Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014808.xml 2022-11-23T02:12:26.6042555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6043000Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6043584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6044059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6044292Z 2022-11-23T02:12:26.6044400Z Running tests... 2022-11-23T02:12:26.6044788Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6045315Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6045910Z test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6046482Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 20925 2022-11-23T02:12:26.6046921Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 20926 2022-11-23T02:12:26.6047538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6047995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6048558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6049032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6049610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6050056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6050615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6051089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6051553Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6052042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6052710Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6053407Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6053939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6054404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6054909Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0kuaovt3 2022-11-23T02:12:26.6055459Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0kuaovt3/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6056051Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzbxg9kd9 2022-11-23T02:12:26.6056586Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzbxg9kd9/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6057148Z [1669168102.582505] [08317a7e7676:20925:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6057660Z [1669168102.596391] [08317a7e7676:20925:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6058134Z [1669168102.596391] [08317a7e7676:20925:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6058683Z [1669168102.584407] [08317a7e7676:20926:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6059184Z [1669168102.597951] [08317a7e7676:20926:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6059655Z [1669168102.597951] [08317a7e7676:20926:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6060142Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6060618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.6061108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6061595Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6061930Z ok (7.021s) 2022-11-23T02:12:26.6062081Z 2022-11-23T02:12:26.6062361Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6062694Z Ran 1 test in 7.021s 2022-11-23T02:12:26.6062856Z 2022-11-23T02:12:26.6062949Z OK 2022-11-23T02:12:26.6063064Z 2022-11-23T02:12:26.6063188Z Generating XML reports... 2022-11-23T02:12:26.6063802Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014817.xml 2022-11-23T02:12:26.6064525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6064968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6065550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6066030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6066267Z 2022-11-23T02:12:26.6066376Z Running tests... 2022-11-23T02:12:26.6066766Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6067297Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6067872Z test_DistributedDataParallel_SyncBatchNorm_No_Affine (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6068404Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21043 2022-11-23T02:12:26.6068862Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21044 2022-11-23T02:12:26.6069713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6070173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6070737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6071216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6071802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6072235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6072886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6073374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6073839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6074330Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6074997Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6075700Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6076304Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6076766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6077271Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5tqg0k8j 2022-11-23T02:12:26.6077823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5tqg0k8j/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6078343Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4hvm2yub 2022-11-23T02:12:26.6078888Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4hvm2yub/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6079448Z [1669168112.253763] [08317a7e7676:21044:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6079968Z [1669168112.267986] [08317a7e7676:21044:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6080427Z [1669168112.267986] [08317a7e7676:21044:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6080945Z [1669168112.253741] [08317a7e7676:21043:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6081443Z [1669168112.268013] [08317a7e7676:21043:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6081913Z [1669168112.268013] [08317a7e7676:21043:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6082374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6082860Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.6083358Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6083845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6084179Z ok (6.765s) 2022-11-23T02:12:26.6084330Z 2022-11-23T02:12:26.6084605Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6084933Z Ran 1 test in 6.765s 2022-11-23T02:12:26.6085096Z 2022-11-23T02:12:26.6085171Z OK 2022-11-23T02:12:26.6085306Z 2022-11-23T02:12:26.6085429Z Generating XML reports... 2022-11-23T02:12:26.6086042Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014826.xml 2022-11-23T02:12:26.6086768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6087206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6087787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6088263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6088493Z 2022-11-23T02:12:26.6088653Z Running tests... 2022-11-23T02:12:26.6089094Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6089632Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6090230Z test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6090788Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21161 2022-11-23T02:12:26.6091249Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21162 2022-11-23T02:12:26.6091871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6092401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6092967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6093448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6094031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6094464Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6095037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6095510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6095972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6096624Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6097167Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6097824Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6098357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6098818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6099321Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzl6f2io3 2022-11-23T02:12:26.6099874Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzl6f2io3/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6100399Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbnaa4tw_ 2022-11-23T02:12:26.6100948Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbnaa4tw_/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6101516Z [1669168121.490077] [08317a7e7676:21162:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6102028Z [1669168121.503368] [08317a7e7676:21162:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6102488Z [1669168121.503368] [08317a7e7676:21162:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6103003Z [1669168121.483906] [08317a7e7676:21161:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6103505Z [1669168121.497670] [08317a7e7676:21161:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6103979Z [1669168121.497670] [08317a7e7676:21161:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6104443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6104990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.6105490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6105979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6106315Z ok (6.170s) 2022-11-23T02:12:26.6106463Z 2022-11-23T02:12:26.6106738Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6107068Z Ran 1 test in 6.171s 2022-11-23T02:12:26.6107229Z 2022-11-23T02:12:26.6107303Z OK 2022-11-23T02:12:26.6107439Z 2022-11-23T02:12:26.6107564Z Generating XML reports... 2022-11-23T02:12:26.6108173Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014836.xml 2022-11-23T02:12:26.6109168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6109631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6110216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6110692Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6110926Z 2022-11-23T02:12:26.6111018Z Running tests... 2022-11-23T02:12:26.6111421Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6111955Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6112517Z test_DistributedDataParallel_non_default_stream (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6113588Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/76428 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.596s) 2022-11-23T02:12:26.6114113Z 2022-11-23T02:12:26.6114381Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6114709Z Ran 1 test in 1.596s 2022-11-23T02:12:26.6114874Z 2022-11-23T02:12:26.6114980Z OK (skipped=1) 2022-11-23T02:12:26.6115136Z 2022-11-23T02:12:26.6115242Z Generating XML reports... 2022-11-23T02:12:26.6115847Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014844.xml 2022-11-23T02:12:26.6116561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6117020Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6117584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6118062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6118293Z 2022-11-23T02:12:26.6118401Z Running tests... 2022-11-23T02:12:26.6118788Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6119314Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6119870Z test_DistributedDataParallel_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6120407Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21313 2022-11-23T02:12:26.6120848Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21314 2022-11-23T02:12:26.6121460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6121921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6122560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6123048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6123637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6124089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6124646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6125118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6125582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6126163Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6126819Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6127518Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6128055Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6128517Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6128867Z ok (4.370s) 2022-11-23T02:12:26.6129017Z 2022-11-23T02:12:26.6129286Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6129614Z Ran 1 test in 4.370s 2022-11-23T02:12:26.6129759Z 2022-11-23T02:12:26.6129851Z OK 2022-11-23T02:12:26.6129990Z 2022-11-23T02:12:26.6130113Z Generating XML reports... 2022-11-23T02:12:26.6130725Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014848.xml 2022-11-23T02:12:26.6131435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6131890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6132474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6132950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6133166Z 2022-11-23T02:12:26.6133274Z Running tests... 2022-11-23T02:12:26.6133677Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6134209Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6134768Z test_DistributedDataParallel_with_amp_and_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6135853Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77294 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.641s) 2022-11-23T02:12:26.6136385Z 2022-11-23T02:12:26.6136649Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6136981Z Ran 1 test in 1.642s 2022-11-23T02:12:26.6137143Z 2022-11-23T02:12:26.6137250Z OK (skipped=1) 2022-11-23T02:12:26.6137388Z 2022-11-23T02:12:26.6137511Z Generating XML reports... 2022-11-23T02:12:26.6138115Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014855.xml 2022-11-23T02:12:26.6138835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6139279Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6139917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6140400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6140633Z 2022-11-23T02:12:26.6140740Z Running tests... 2022-11-23T02:12:26.6141126Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6141659Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6142200Z test_DistributedSampler_padding (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6142700Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21450 2022-11-23T02:12:26.6143155Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21451 2022-11-23T02:12:26.6143825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6144283Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6144850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6145328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6145914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6146349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6146923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6147393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6147857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6148347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6149246Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6149966Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6150501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6150963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6151483Z [1669168145.373496] [08317a7e7676:21451:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6151993Z [1669168145.386934] [08317a7e7676:21451:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6152470Z [1669168145.386934] [08317a7e7676:21451:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6152974Z [1669168145.364292] [08317a7e7676:21450:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6153474Z [1669168145.378057] [08317a7e7676:21450:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6153944Z [1669168145.378057] [08317a7e7676:21450:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6154285Z ok (6.219s) 2022-11-23T02:12:26.6154416Z 2022-11-23T02:12:26.6154689Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6155024Z Ran 1 test in 6.219s 2022-11-23T02:12:26.6155188Z 2022-11-23T02:12:26.6155281Z OK 2022-11-23T02:12:26.6155414Z 2022-11-23T02:12:26.6155520Z Generating XML reports... 
2022-11-23T02:12:26.6156125Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014900.xml 2022-11-23T02:12:26.6156978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6157452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6158023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6158500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6158735Z 2022-11-23T02:12:26.6158843Z Running tests... 2022-11-23T02:12:26.6159228Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6159835Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6160349Z test_SyncBatchNorm_process_group (__main__.TestDistBackendWithSpawn) ... skip: no torchvision (0.002s) 2022-11-23T02:12:26.6160640Z 2022-11-23T02:12:26.6160906Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6161214Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6161374Z 2022-11-23T02:12:26.6161482Z OK (skipped=1) 2022-11-23T02:12:26.6161637Z 2022-11-23T02:12:26.6161760Z Generating XML reports... 
2022-11-23T02:12:26.6162353Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014908.xml 2022-11-23T02:12:26.6163073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6163527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6164106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6164566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6164799Z 2022-11-23T02:12:26.6164907Z Running tests... 2022-11-23T02:12:26.6165315Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6165831Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6166295Z test_accumulate_gradients_no_sync (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.6166808Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:12:26.6167115Z 2022-11-23T02:12:26.6167383Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6167697Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6167858Z 2022-11-23T02:12:26.6167965Z OK (skipped=1) 2022-11-23T02:12:26.6168120Z 2022-11-23T02:12:26.6168249Z Generating XML reports... 
2022-11-23T02:12:26.6168838Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014911.xml 2022-11-23T02:12:26.6169559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6170014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6170598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6171057Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6171291Z 2022-11-23T02:12:26.6171398Z Running tests... 2022-11-23T02:12:26.6171801Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6172329Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6172794Z test_accumulate_gradients_no_sync_allreduce_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.6173332Z Runs multiple iterations on _test_accumulate_gradients_no_sync ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:12:26.6173644Z 2022-11-23T02:12:26.6173960Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6174275Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6174437Z 2022-11-23T02:12:26.6174545Z OK (skipped=1) 2022-11-23T02:12:26.6174700Z 2022-11-23T02:12:26.6174822Z Generating XML reports... 
2022-11-23T02:12:26.6175482Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014913.xml 2022-11-23T02:12:26.6176193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6176647Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6177227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6177742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6177975Z 2022-11-23T02:12:26.6178079Z Running tests... 2022-11-23T02:12:26.6178487Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6179015Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6179501Z test_accumulate_gradients_no_sync_allreduce_with_then_hook (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.6180073Z Runs multiple iterations on _test_accumulate_gradients_no_sync using allreduce ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:12:26.6180405Z 2022-11-23T02:12:26.6180666Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6180995Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6181139Z 2022-11-23T02:12:26.6181250Z OK (skipped=1) 2022-11-23T02:12:26.6181404Z 2022-11-23T02:12:26.6181526Z Generating XML reports... 
2022-11-23T02:12:26.6182132Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014915.xml 2022-11-23T02:12:26.6182836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6183291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6183869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6184340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6184554Z 2022-11-23T02:12:26.6184661Z Running tests... 2022-11-23T02:12:26.6185061Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6185595Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6186060Z test_accumulate_gradients_no_sync_grad_is_view (__main__.TestDistBackendWithSpawn) 2022-11-23T02:12:26.6186585Z Runs _test_accumulate_gradients_no_sync using default inputs ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:12:26.6186896Z 2022-11-23T02:12:26.6187162Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6187488Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6187649Z 2022-11-23T02:12:26.6187738Z OK (skipped=1) 2022-11-23T02:12:26.6187892Z 2022-11-23T02:12:26.6188016Z Generating XML reports... 
2022-11-23T02:12:26.6188619Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014918.xml 2022-11-23T02:12:26.6189379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6189978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6190565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6191038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6191269Z 2022-11-23T02:12:26.6191431Z Running tests... 2022-11-23T02:12:26.6191850Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6192378Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6192876Z test_all_gather (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6193345Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 21729 2022-11-23T02:12:26.6193806Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 21730 2022-11-23T02:12:26.6194422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6194929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6195512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6195989Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6196572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6197003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6197579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6198052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6198495Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6199004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6199669Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6200376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6200895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6201484Z STAGE:2022-11-23 01:49:24 21729:21729 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6201969Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6202552Z STAGE:2022-11-23 01:49:24 21730:21730 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6203058Z [1669168164.758821] [08317a7e7676:21729:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6203574Z [1669168166.403911] [08317a7e7676:21729:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6204052Z [1669168166.403911] [08317a7e7676:21729:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6204647Z STAGE:2022-11-23 01:49:26 21729:21729 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6205158Z [1669168164.760538] [08317a7e7676:21730:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6205662Z [1669168166.437333] [08317a7e7676:21730:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6206132Z [1669168166.437333] [08317a7e7676:21730:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6206720Z STAGE:2022-11-23 01:49:26 21730:21730 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6207308Z STAGE:2022-11-23 01:49:26 21729:21729 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6207970Z STAGE:2022-11-23 01:49:26 21730:21730 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6208561Z STAGE:2022-11-23 01:49:26 21729:21729 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6209132Z STAGE:2022-11-23 01:49:26 21730:21730 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6209700Z STAGE:2022-11-23 01:49:26 21729:21729 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6210515Z STAGE:2022-11-23 01:49:26 21730:21730 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:49:26 21729:21729 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6210980Z 2022-11-23T02:12:26.6211330Z STAGE:2022-11-23 01:49:26 21730:21730 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6211690Z ok (6.680s) 2022-11-23T02:12:26.6211818Z 2022-11-23T02:12:26.6212088Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6212416Z Ran 1 test in 6.680s 2022-11-23T02:12:26.6212578Z 2022-11-23T02:12:26.6212672Z OK 2022-11-23T02:12:26.6212804Z 2022-11-23T02:12:26.6212911Z Generating XML reports... 
2022-11-23T02:12:26.6213520Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014920.xml 2022-11-23T02:12:26.6214244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6214704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6215268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6215752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6215984Z 2022-11-23T02:12:26.6216093Z Running tests... 2022-11-23T02:12:26.6216482Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6217017Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6217572Z test_all_gather_coalesced_complex (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:12:26.6217898Z 2022-11-23T02:12:26.6218163Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6218472Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6218631Z 2022-11-23T02:12:26.6218739Z OK (skipped=1) 2022-11-23T02:12:26.6218895Z 2022-11-23T02:12:26.6219019Z Generating XML reports... 
2022-11-23T02:12:26.6219609Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014930.xml 2022-11-23T02:12:26.6220340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6220798Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6221385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6221845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6222082Z 2022-11-23T02:12:26.6222191Z Running tests... 2022-11-23T02:12:26.6222592Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6223109Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6223413Z test_all_gather_coalesced_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:12:26.6223436Z 2022-11-23T02:12:26.6223700Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6223811Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6223830Z 2022-11-23T02:12:26.6223938Z OK (skipped=1) 2022-11-23T02:12:26.6223957Z 2022-11-23T02:12:26.6224131Z Generating XML reports... 
2022-11-23T02:12:26.6224589Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014932.xml 2022-11-23T02:12:26.6224965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6225142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6225507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6225699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6225719Z 2022-11-23T02:12:26.6225883Z Running tests... 2022-11-23T02:12:26.6226147Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6226462Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6226756Z test_all_gather_coalesced_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:12:26.6226776Z 2022-11-23T02:12:26.6227034Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6227146Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6227165Z 2022-11-23T02:12:26.6227274Z OK (skipped=1) 2022-11-23T02:12:26.6227292Z 2022-11-23T02:12:26.6227397Z Generating XML reports... 
2022-11-23T02:12:26.6227842Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014934.xml 2022-11-23T02:12:26.6228215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6228395Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6228779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6229194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6229216Z 2022-11-23T02:12:26.6229330Z Running tests... 2022-11-23T02:12:26.6229597Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6229894Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6230189Z test_all_gather_coalesced_simple (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.002s) 2022-11-23T02:12:26.6230208Z 2022-11-23T02:12:26.6230469Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6230580Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6230603Z 2022-11-23T02:12:26.6230710Z OK (skipped=1) 2022-11-23T02:12:26.6230729Z 2022-11-23T02:12:26.6230852Z Generating XML reports... 
2022-11-23T02:12:26.6231294Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014937.xml 2022-11-23T02:12:26.6231669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6231844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6232209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6232401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6232421Z 2022-11-23T02:12:26.6232527Z Running tests... 2022-11-23T02:12:26.6232787Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6233096Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6233393Z test_all_gather_coalesced_with_empty (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support all_gather_coalesced (0.003s) 2022-11-23T02:12:26.6233413Z 2022-11-23T02:12:26.6233750Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6233868Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6233887Z 2022-11-23T02:12:26.6233994Z OK (skipped=1) 2022-11-23T02:12:26.6234014Z 2022-11-23T02:12:26.6234118Z Generating XML reports... 
2022-11-23T02:12:26.6234568Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014939.xml 2022-11-23T02:12:26.6234945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6235122Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6235504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6235765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6235785Z 2022-11-23T02:12:26.6235893Z Running tests... 2022-11-23T02:12:26.6236161Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6236474Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6236720Z test_all_gather_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6236943Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22008 2022-11-23T02:12:26.6237164Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22009 2022-11-23T02:12:26.6237541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6237716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6238107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6238300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6238674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6238830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6239211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6239399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6239650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6239900Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6240309Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6240711Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6240946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6241284Z STAGE:2022-11-23 01:49:45 22008:22008 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6241498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6241829Z STAGE:2022-11-23 01:49:46 22009:22009 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6242105Z [1669168186.032959] [08317a7e7676:22008:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6242340Z [1669168187.669366] [08317a7e7676:22008:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6242580Z [1669168187.669366] [08317a7e7676:22008:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6242900Z [1669168186.053736] [08317a7e7676:22009:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6243136Z [1669168187.659616] [08317a7e7676:22009:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6243372Z [1669168187.659616] [08317a7e7676:22009:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6243936Z STAGE:2022-11-23 01:49:48 22008:22008 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:49:48 22009:22009 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6244000Z 2022-11-23T02:12:26.6244358Z STAGE:2022-11-23 01:49:48 22008:22008 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6244707Z STAGE:2022-11-23 01:49:48 22009:22009 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6245025Z STAGE:2022-11-23 01:49:48 22009:22009 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6245347Z STAGE:2022-11-23 01:49:48 22008:22008 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6245683Z STAGE:2022-11-23 01:49:48 22009:22009 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6246031Z STAGE:2022-11-23 01:49:48 22009:22009 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6246366Z STAGE:2022-11-23 01:49:48 22008:22008 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6246712Z STAGE:2022-11-23 01:49:48 22008:22008 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6246817Z ok (6.585s) 2022-11-23T02:12:26.6246836Z 2022-11-23T02:12:26.6247100Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6247214Z Ran 1 test in 6.585s 2022-11-23T02:12:26.6247237Z 2022-11-23T02:12:26.6247311Z OK 2022-11-23T02:12:26.6247330Z 2022-11-23T02:12:26.6247455Z Generating XML reports... 
2022-11-23T02:12:26.6247906Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014942.xml 2022-11-23T02:12:26.6248284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6248460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6248844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6249040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6249060Z 2022-11-23T02:12:26.6249169Z Running tests... 2022-11-23T02:12:26.6249415Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6249736Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6250004Z test_all_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T02:12:26.6250024Z 2022-11-23T02:12:26.6250281Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6250392Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6250411Z 2022-11-23T02:12:26.6250517Z OK (skipped=1) 2022-11-23T02:12:26.6250536Z 2022-11-23T02:12:26.6250659Z Generating XML reports... 
2022-11-23T02:12:26.6251109Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014951.xml 2022-11-23T02:12:26.6251490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6251650Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6252080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6252277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6252296Z 2022-11-23T02:12:26.6252406Z Running tests... 2022-11-23T02:12:26.6252668Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6252982Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6253259Z test_all_gather_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all gather (0.002s) 2022-11-23T02:12:26.6253279Z 2022-11-23T02:12:26.6253539Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6253696Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6253715Z 2022-11-23T02:12:26.6253804Z OK (skipped=1) 2022-11-23T02:12:26.6253823Z 2022-11-23T02:12:26.6253946Z Generating XML reports... 
2022-11-23T02:12:26.6254399Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014953.xml 2022-11-23T02:12:26.6254777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6254953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6255334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6255526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6255546Z 2022-11-23T02:12:26.6255655Z Running tests... 2022-11-23T02:12:26.6255897Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6256215Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6256482Z test_all_gather_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6256708Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22188 2022-11-23T02:12:26.6256927Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22189 2022-11-23T02:12:26.6257301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6257477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6257860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6258052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6258407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6258580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6258964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6259154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6259402Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6259650Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6260053Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6260455Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6260691Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6260919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6261190Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6261436Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6261840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6262233Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6262568Z STAGE:2022-11-23 01:50:00 22189:22189 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6262892Z STAGE:2022-11-23 01:50:00 22188:22188 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6263220Z [1669168200.021739] [08317a7e7676:22188:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6263455Z [1669168201.680220] [08317a7e7676:22188:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6263676Z [1669168201.680220] [08317a7e7676:22188:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6263949Z [1669168200.042646] [08317a7e7676:22189:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6264179Z [1669168201.720228] [08317a7e7676:22189:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6264415Z [1669168201.720228] [08317a7e7676:22189:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6264977Z STAGE:2022-11-23 01:50:02 22188:22188 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:50:02 22189:22189 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6264997Z 2022-11-23T02:12:26.6265350Z STAGE:2022-11-23 01:50:02 22189:22189 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6265700Z STAGE:2022-11-23 01:50:02 22188:22188 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6266030Z STAGE:2022-11-23 01:50:02 22188:22188 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6266353Z STAGE:2022-11-23 01:50:02 22189:22189 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6266688Z STAGE:2022-11-23 01:50:02 22188:22188 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6267019Z STAGE:2022-11-23 01:50:02 22188:22188 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6267363Z STAGE:2022-11-23 01:50:02 22189:22189 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6267708Z STAGE:2022-11-23 01:50:02 22189:22189 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6267812Z ok (6.876s) 2022-11-23T02:12:26.6267831Z 2022-11-23T02:12:26.6268096Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6268210Z Ran 1 test in 6.877s 2022-11-23T02:12:26.6268229Z 2022-11-23T02:12:26.6268321Z OK 2022-11-23T02:12:26.6268340Z 2022-11-23T02:12:26.6268464Z Generating XML reports... 
2022-11-23T02:12:26.6268914Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014956.xml 2022-11-23T02:12:26.6269499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6269683Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6270070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6270264Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6270355Z 2022-11-23T02:12:26.6270469Z Running tests... 2022-11-23T02:12:26.6270734Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6271052Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6271312Z test_all_gather_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6271515Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22302 2022-11-23T02:12:26.6271735Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22303 2022-11-23T02:12:26.6272110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6272353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6272727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6272909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6273292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6273484Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6273868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6274044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6274293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6274542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6274949Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6275353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6275587Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6275821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6275983Z skip: Skipped due to small world size. (4.220s) 2022-11-23T02:12:26.6276003Z 2022-11-23T02:12:26.6276269Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6276363Z Ran 1 test in 4.220s 2022-11-23T02:12:26.6276382Z 2022-11-23T02:12:26.6276489Z OK (skipped=1) 2022-11-23T02:12:26.6276511Z 2022-11-23T02:12:26.6276634Z Generating XML reports... 2022-11-23T02:12:26.6277086Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015005.xml 2022-11-23T02:12:26.6277465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6277642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6278028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6278220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6278239Z 2022-11-23T02:12:26.6278346Z Running tests... 2022-11-23T02:12:26.6278595Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6278908Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6279215Z test_all_gather_into_cat_tensor_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T02:12:26.6279234Z 2022-11-23T02:12:26.6279498Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6279658Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6279679Z 2022-11-23T02:12:26.6279788Z OK (skipped=1) 2022-11-23T02:12:26.6279806Z 2022-11-23T02:12:26.6279931Z Generating XML reports... 2022-11-23T02:12:26.6280383Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015012.xml 2022-11-23T02:12:26.6280740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6280916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6281300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6281553Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6281572Z 2022-11-23T02:12:26.6281680Z Running tests... 2022-11-23T02:12:26.6281946Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6282260Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6282568Z test_all_gather_into_stack_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_gather_into_tensor (0.002s) 2022-11-23T02:12:26.6282587Z 2022-11-23T02:12:26.6282847Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6282940Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6282959Z 2022-11-23T02:12:26.6283065Z OK (skipped=1) 2022-11-23T02:12:26.6283084Z 2022-11-23T02:12:26.6283208Z Generating XML reports... 
2022-11-23T02:12:26.6283658Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015014.xml 2022-11-23T02:12:26.6284039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6284215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6284599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6284793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6284812Z 2022-11-23T02:12:26.6284919Z Running tests... 2022-11-23T02:12:26.6285168Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6285482Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6285771Z test_all_gather_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T02:12:26.6285794Z 2022-11-23T02:12:26.6286054Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6286167Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6286186Z 2022-11-23T02:12:26.6286295Z OK (skipped=1) 2022-11-23T02:12:26.6286314Z 2022-11-23T02:12:26.6286437Z Generating XML reports... 
2022-11-23T02:12:26.6286886Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015016.xml 2022-11-23T02:12:26.6287259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6287419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6287802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6287996Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6288016Z 2022-11-23T02:12:26.6288128Z Running tests... 2022-11-23T02:12:26.6288389Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6288701Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6289090Z test_all_gather_multigpu_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports allgather multigpu (0.002s) 2022-11-23T02:12:26.6289112Z 2022-11-23T02:12:26.6289380Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6289474Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6289511Z 2022-11-23T02:12:26.6289600Z OK (skipped=1) 2022-11-23T02:12:26.6289619Z 2022-11-23T02:12:26.6289742Z Generating XML reports... 
2022-11-23T02:12:26.6290188Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015019.xml 2022-11-23T02:12:26.6290562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6290789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6291174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6291369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6291389Z 2022-11-23T02:12:26.6291497Z Running tests... 2022-11-23T02:12:26.6291743Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6292056Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6292335Z test_all_gather_object_default_pg (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6292555Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22537 2022-11-23T02:12:26.6292776Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22538 2022-11-23T02:12:26.6293157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6293332Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6293717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6293893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6294263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6294437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6294817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6295006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6295258Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6295507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6295912Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6296313Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6296528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6296762Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6297036Z [1669168225.584580] [08317a7e7676:22537:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6297266Z [1669168226.999857] [08317a7e7676:22537:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6297509Z [1669168226.999857] [08317a7e7676:22537:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6297828Z [1669168225.584604] [08317a7e7676:22538:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6298063Z [1669168226.979311] [08317a7e7676:22538:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6298300Z [1669168226.979311] [08317a7e7676:22538:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6298403Z ok (7.271s) 2022-11-23T02:12:26.6298423Z 2022-11-23T02:12:26.6298695Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6298788Z Ran 1 test in 7.272s 2022-11-23T02:12:26.6298808Z 2022-11-23T02:12:26.6298946Z OK 2022-11-23T02:12:26.6298966Z 2022-11-23T02:12:26.6299091Z Generating XML reports... 
2022-11-23T02:12:26.6299546Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015021.xml 2022-11-23T02:12:26.6299926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6300103Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6300485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6300680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6300699Z 2022-11-23T02:12:26.6300808Z Running tests... 2022-11-23T02:12:26.6301053Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6301367Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6301646Z test_all_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6301868Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 22648 2022-11-23T02:12:26.6302091Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 22649 2022-11-23T02:12:26.6302467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6302645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6303027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6303201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6303568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6303744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6304129Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6304319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6304571Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6304818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6305221Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6305621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6305838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6306072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6306315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6306607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6307017Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6307413Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6307657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:12:26.6307897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:12:26.6308291Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.6308722Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.6309178Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:12:26.6309435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:12:26.6309833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:12:26.6310228Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:12:26.6310506Z [1669168235.367786] [08317a7e7676:22649:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6310737Z [1669168236.766211] [08317a7e7676:22649:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6310979Z [1669168236.766211] [08317a7e7676:22649:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6311257Z [1669168235.347201] [08317a7e7676:22648:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6311487Z [1669168236.761455] [08317a7e7676:22648:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6311705Z [1669168236.761455] [08317a7e7676:22648:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6311807Z ok (7.613s) 2022-11-23T02:12:26.6311827Z 2022-11-23T02:12:26.6312098Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6312211Z Ran 1 test in 7.614s 2022-11-23T02:12:26.6312230Z 2022-11-23T02:12:26.6312326Z OK 2022-11-23T02:12:26.6312345Z 2022-11-23T02:12:26.6312470Z Generating XML reports... 
2022-11-23T02:12:26.6312921Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015031.xml 2022-11-23T02:12:26.6313298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6313475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6313846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6314041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6314060Z 2022-11-23T02:12:26.6314168Z Running tests... 2022-11-23T02:12:26.6314437Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6314750Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6315019Z test_all_gather_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports all_gather_v (0.003s) 2022-11-23T02:12:26.6315038Z 2022-11-23T02:12:26.6315299Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6315480Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6315502Z 2022-11-23T02:12:26.6315597Z OK (skipped=1) 2022-11-23T02:12:26.6315634Z 2022-11-23T02:12:26.6315740Z Generating XML reports... 
2022-11-23T02:12:26.6316193Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015041.xml 2022-11-23T02:12:26.6316567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6316745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6317126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6317379Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6317399Z 2022-11-23T02:12:26.6317507Z Running tests... 2022-11-23T02:12:26.6317775Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6318076Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6318500Z test_all_reduce_coalesced_full_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6318520Z 2022-11-23T02:12:26.6318775Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6318886Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6318905Z 2022-11-23T02:12:26.6319012Z OK (skipped=1) 2022-11-23T02:12:26.6319031Z 2022-11-23T02:12:26.6319156Z Generating XML reports... 
2022-11-23T02:12:26.6319604Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015044.xml 2022-11-23T02:12:26.6319985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6320162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6320534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6320729Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6320748Z 2022-11-23T02:12:26.6320855Z Running tests... 2022-11-23T02:12:26.6321115Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6321426Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6321846Z test_all_reduce_coalesced_full_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6321868Z 2022-11-23T02:12:26.6322123Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6322235Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6322254Z 2022-11-23T02:12:26.6322361Z OK (skipped=1) 2022-11-23T02:12:26.6322380Z 2022-11-23T02:12:26.6322483Z Generating XML reports... 
2022-11-23T02:12:26.6322930Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015046.xml 2022-11-23T02:12:26.6323303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6323478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6323859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6324052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6324071Z 2022-11-23T02:12:26.6324182Z Running tests... 2022-11-23T02:12:26.6324444Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6324739Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6325225Z test_all_reduce_coalesced_full_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6325247Z 2022-11-23T02:12:26.6325518Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6325631Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6325650Z 2022-11-23T02:12:26.6325758Z OK (skipped=1) 2022-11-23T02:12:26.6325777Z 2022-11-23T02:12:26.6325901Z Generating XML reports... 
2022-11-23T02:12:26.6326351Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015048.xml 2022-11-23T02:12:26.6326724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6327005Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6327376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6327573Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6327593Z 2022-11-23T02:12:26.6327700Z Running tests... 2022-11-23T02:12:26.6327964Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6328275Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6328694Z test_all_reduce_coalesced_full_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6328714Z 2022-11-23T02:12:26.6328971Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6329083Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6329107Z 2022-11-23T02:12:26.6329212Z OK (skipped=1) 2022-11-23T02:12:26.6329232Z 2022-11-23T02:12:26.6329336Z Generating XML reports... 
2022-11-23T02:12:26.6329786Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015051.xml 2022-11-23T02:12:26.6330167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6330345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6330729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6330922Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6330941Z 2022-11-23T02:12:26.6331048Z Running tests... 2022-11-23T02:12:26.6331310Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6331623Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6332026Z test_all_reduce_coalesced_group_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6332045Z 2022-11-23T02:12:26.6332308Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6332419Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6332439Z 2022-11-23T02:12:26.6332546Z OK (skipped=1) 2022-11-23T02:12:26.6332564Z 2022-11-23T02:12:26.6332687Z Generating XML reports... 
2022-11-23T02:12:26.6333134Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015053.xml 2022-11-23T02:12:26.6333511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6333687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6334069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6334249Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6334268Z 2022-11-23T02:12:26.6334376Z Running tests... 2022-11-23T02:12:26.6334688Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6335008Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6335420Z test_all_reduce_coalesced_group_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6335441Z 2022-11-23T02:12:26.6335699Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6335810Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6335829Z 2022-11-23T02:12:26.6335936Z OK (skipped=1) 2022-11-23T02:12:26.6335955Z 2022-11-23T02:12:26.6336058Z Generating XML reports... 
2022-11-23T02:12:26.6336556Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015056.xml 2022-11-23T02:12:26.6336929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6337110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6337494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6337687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6337706Z 2022-11-23T02:12:26.6337815Z Running tests... 2022-11-23T02:12:26.6338079Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6338389Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6338795Z test_all_reduce_coalesced_group_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6338836Z 2022-11-23T02:12:26.6339080Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6339190Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6339209Z 2022-11-23T02:12:26.6339315Z OK (skipped=1) 2022-11-23T02:12:26.6339336Z 2022-11-23T02:12:26.6339460Z Generating XML reports... 
2022-11-23T02:12:26.6339906Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015058.xml 2022-11-23T02:12:26.6340278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6340453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6340829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6341021Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6341043Z 2022-11-23T02:12:26.6341156Z Running tests... 2022-11-23T02:12:26.6341403Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6341718Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6342138Z test_all_reduce_coalesced_group_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6342158Z 2022-11-23T02:12:26.6342421Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6342533Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6342553Z 2022-11-23T02:12:26.6342658Z OK (skipped=1) 2022-11-23T02:12:26.6342676Z 2022-11-23T02:12:26.6342799Z Generating XML reports... 
2022-11-23T02:12:26.6343244Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015100.xml 2022-11-23T02:12:26.6343618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6343782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6344216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6344416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6344435Z 2022-11-23T02:12:26.6344542Z Running tests... 2022-11-23T02:12:26.6344805Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6345120Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6345520Z test_all_reduce_coalesced_max (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6345540Z 2022-11-23T02:12:26.6345798Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6345971Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6345990Z 2022-11-23T02:12:26.6346077Z OK (skipped=1) 2022-11-23T02:12:26.6346096Z 2022-11-23T02:12:26.6346218Z Generating XML reports... 
2022-11-23T02:12:26.6346666Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015103.xml 2022-11-23T02:12:26.6347040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6347216Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6347597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6347791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6347810Z 2022-11-23T02:12:26.6347919Z Running tests... 2022-11-23T02:12:26.6348161Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6348483Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6348785Z test_all_reduce_coalesced_max_complex_unsupported (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6349229Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23110 2022-11-23T02:12:26.6349462Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23111 2022-11-23T02:12:26.6349844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6350024Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6350408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6350600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6350952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6351132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6351512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6351702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6351952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6352203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6352609Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6353010Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6353249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6354060Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:12:26.6354182Z warnings.warn( 2022-11-23T02:12:26.6354418Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6355167Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:12:26.6355279Z warnings.warn( 2022-11-23T02:12:26.6355379Z ok (4.201s) 2022-11-23T02:12:26.6355452Z 2022-11-23T02:12:26.6355723Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6355834Z Ran 1 test in 4.201s 2022-11-23T02:12:26.6355853Z 2022-11-23T02:12:26.6355944Z OK 2022-11-23T02:12:26.6355963Z 2022-11-23T02:12:26.6356069Z Generating XML reports... 
2022-11-23T02:12:26.6356522Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015105.xml 2022-11-23T02:12:26.6356896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6357071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6357455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6357649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6357668Z 2022-11-23T02:12:26.6357775Z Running tests... 2022-11-23T02:12:26.6358043Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6358359Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6358744Z test_all_reduce_coalesced_min (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6358764Z 2022-11-23T02:12:26.6359024Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6359136Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6359155Z 2022-11-23T02:12:26.6359262Z OK (skipped=1) 2022-11-23T02:12:26.6359280Z 2022-11-23T02:12:26.6359401Z Generating XML reports... 
2022-11-23T02:12:26.6359849Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015112.xml 2022-11-23T02:12:26.6360228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6360408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6360793Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6360973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6360992Z 2022-11-23T02:12:26.6361099Z Running tests... 2022-11-23T02:12:26.6361362Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6361674Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6362083Z test_all_reduce_coalesced_product (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6362103Z 2022-11-23T02:12:26.6362364Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6362474Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6362493Z 2022-11-23T02:12:26.6362602Z OK (skipped=1) 2022-11-23T02:12:26.6362623Z 2022-11-23T02:12:26.6362728Z Generating XML reports... 
2022-11-23T02:12:26.6363177Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015114.xml 2022-11-23T02:12:26.6363604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6363789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6364176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6364369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6364388Z 2022-11-23T02:12:26.6364498Z Running tests... 2022-11-23T02:12:26.6364759Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6365074Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6365513Z test_all_reduce_coalesced_sum (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.6365551Z 2022-11-23T02:12:26.6365801Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6365916Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6365935Z 2022-11-23T02:12:26.6366042Z OK (skipped=1) 2022-11-23T02:12:26.6366061Z 2022-11-23T02:12:26.6366185Z Generating XML reports... 
2022-11-23T02:12:26.6366633Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015117.xml 2022-11-23T02:12:26.6367007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6367183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6367571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6367747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6367768Z 2022-11-23T02:12:26.6367877Z Running tests... 2022-11-23T02:12:26.6368140Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6368456Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6368748Z test_all_reduce_complex_unsupported_ops (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6368972Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23312 2022-11-23T02:12:26.6369194Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23313 2022-11-23T02:12:26.6369568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6369728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6370118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6370312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6370686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6370861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6371239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6371427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6371679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6371927Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6372320Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6372723Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6373002Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6373241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6373343Z ok (4.350s) 2022-11-23T02:12:26.6373363Z 2022-11-23T02:12:26.6373630Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6373742Z Ran 1 test in 4.350s 2022-11-23T02:12:26.6373761Z 2022-11-23T02:12:26.6373854Z OK 2022-11-23T02:12:26.6373873Z 2022-11-23T02:12:26.6373995Z Generating XML reports... 2022-11-23T02:12:26.6374430Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015119.xml 2022-11-23T02:12:26.6374860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6375038Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6375426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6375618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6375638Z 2022-11-23T02:12:26.6375747Z Running tests... 2022-11-23T02:12:26.6376014Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6376331Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6376587Z test_all_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6376812Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23415 2022-11-23T02:12:26.6377036Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23416 2022-11-23T02:12:26.6377412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6377592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6377979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6378172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6378538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6378713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6379073Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6379270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6379518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6379769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6380170Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6380571Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6380807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6381055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6381281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6381506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6381909Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6382353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6382702Z STAGE:2022-11-23 01:51:30 23416:23416 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6383026Z STAGE:2022-11-23 01:51:30 23415:23415 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6383304Z [1669168290.355783] [08317a7e7676:23415:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6383538Z [1669168292.010955] [08317a7e7676:23415:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6383831Z [1669168292.010955] [08317a7e7676:23415:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6384109Z [1669168290.378516] [08317a7e7676:23416:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6384321Z [1669168292.044426] [08317a7e7676:23416:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6384557Z [1669168292.044426] [08317a7e7676:23416:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6385122Z STAGE:2022-11-23 01:51:32 23415:23415 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:51:32 23416:23416 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6385142Z 2022-11-23T02:12:26.6385497Z STAGE:2022-11-23 01:51:32 23416:23416 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6385845Z STAGE:2022-11-23 01:51:32 23415:23415 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6386176Z STAGE:2022-11-23 01:51:32 23415:23415 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6386502Z STAGE:2022-11-23 01:51:32 23416:23416 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6386838Z STAGE:2022-11-23 01:51:32 23415:23415 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6387399Z STAGE:2022-11-23 01:51:32 23416:23416 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:51:32 23415:23415 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6387419Z 2022-11-23T02:12:26.6387766Z STAGE:2022-11-23 01:51:32 23416:23416 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6387868Z ok (6.661s) 2022-11-23T02:12:26.6387887Z 2022-11-23T02:12:26.6388135Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6388246Z Ran 1 test in 6.661s 2022-11-23T02:12:26.6388264Z 2022-11-23T02:12:26.6388351Z OK 2022-11-23T02:12:26.6388370Z 2022-11-23T02:12:26.6388496Z Generating XML reports... 
2022-11-23T02:12:26.6389170Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015126.xml 2022-11-23T02:12:26.6389565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6389745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6390128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6390319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6390339Z 2022-11-23T02:12:26.6390433Z Running tests... 2022-11-23T02:12:26.6390697Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6391011Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6391353Z test_all_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6391581Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23529 2022-11-23T02:12:26.6391801Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23530 2022-11-23T02:12:26.6392168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6392344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6392709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6392961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6393330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6393500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6393885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6394072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6394318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6394564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6394963Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6395347Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6395580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6395828Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6396057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6396293Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6396695Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6397088Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6397420Z STAGE:2022-11-23 01:51:39 23530:23530 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6397748Z STAGE:2022-11-23 01:51:39 23529:23529 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6398009Z [1669168299.567063] [08317a7e7676:23530:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6398241Z [1669168301.218072] [08317a7e7676:23530:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6398476Z [1669168301.218072] [08317a7e7676:23530:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6398747Z [1669168299.546272] [08317a7e7676:23529:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6398970Z [1669168301.170517] [08317a7e7676:23529:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6399204Z [1669168301.170517] [08317a7e7676:23529:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6399802Z STAGE:2022-11-23 01:51:41 23530:23530 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:51:41 23529:23529 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6399825Z 2022-11-23T02:12:26.6400399Z STAGE:2022-11-23 01:51:41 23530:23530 ActivityProfilerController.cpp:310] Completed Stage: Post ProcessingSTAGE:2022-11-23 01:51:41 23529:23529 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6400419Z 2022-11-23T02:12:26.6400748Z STAGE:2022-11-23 01:51:41 23530:23530 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6401070Z STAGE:2022-11-23 01:51:41 23529:23529 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6401398Z STAGE:2022-11-23 01:51:41 23530:23530 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6401763Z STAGE:2022-11-23 01:51:41 23529:23529 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6402104Z STAGE:2022-11-23 01:51:41 23530:23530 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6402448Z STAGE:2022-11-23 01:51:41 23529:23529 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6402541Z ok (6.676s) 2022-11-23T02:12:26.6402560Z 2022-11-23T02:12:26.6402821Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6402930Z Ran 1 test in 6.676s 2022-11-23T02:12:26.6402949Z 2022-11-23T02:12:26.6403030Z OK 2022-11-23T02:12:26.6403049Z 2022-11-23T02:12:26.6403165Z Generating XML reports... 
2022-11-23T02:12:26.6403610Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015135.xml 2022-11-23T02:12:26.6403972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6404147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6404526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6404715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6404735Z 2022-11-23T02:12:26.6404835Z Running tests... 2022-11-23T02:12:26.6405090Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6405398Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6405672Z test_all_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6405877Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23643 2022-11-23T02:12:26.6406090Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23644 2022-11-23T02:12:26.6406461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6406629Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6407009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6407196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6407557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6407725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6408102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6408274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6408520Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6408761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6409205Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6409615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6409842Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6410077Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6410299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6410528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6410986Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6411377Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6411707Z STAGE:2022-11-23 01:51:48 23643:23643 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6412028Z STAGE:2022-11-23 01:51:48 23644:23644 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6412301Z [1669168308.782443] [08317a7e7676:23643:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6412525Z [1669168310.436931] [08317a7e7676:23643:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6412757Z [1669168310.436931] [08317a7e7676:23643:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6413028Z [1669168308.784166] [08317a7e7676:23644:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6413254Z [1669168310.409638] [08317a7e7676:23644:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6413485Z [1669168310.409638] [08317a7e7676:23644:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6414025Z STAGE:2022-11-23 01:51:50 23643:23643 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:51:50 23644:23644 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6414054Z 2022-11-23T02:12:26.6414389Z STAGE:2022-11-23 01:51:50 23644:23644 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6414739Z STAGE:2022-11-23 01:51:50 23643:23643 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6415069Z STAGE:2022-11-23 01:51:50 23643:23643 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6415393Z STAGE:2022-11-23 01:51:50 23644:23644 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6415727Z STAGE:2022-11-23 01:51:50 23643:23643 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6416057Z STAGE:2022-11-23 01:51:50 23644:23644 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6416403Z STAGE:2022-11-23 01:51:50 23643:23643 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6416741Z STAGE:2022-11-23 01:51:50 23644:23644 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6416835Z ok (6.651s) 2022-11-23T02:12:26.6416854Z 2022-11-23T02:12:26.6417101Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6417214Z Ran 1 test in 6.651s 2022-11-23T02:12:26.6417234Z 2022-11-23T02:12:26.6417322Z OK 2022-11-23T02:12:26.6417341Z 2022-11-23T02:12:26.6417465Z Generating XML reports... 
2022-11-23T02:12:26.6417966Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015144.xml 2022-11-23T02:12:26.6418351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6418531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6418910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6419086Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6419117Z 2022-11-23T02:12:26.6419207Z Running tests... 2022-11-23T02:12:26.6419466Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6419824Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6420088Z test_all_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6420305Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23757 2022-11-23T02:12:26.6420520Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23758 2022-11-23T02:12:26.6420893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6421066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6421431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6421622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6421994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6422163Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6422549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6422741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6422988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6423228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6423615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6424014Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6424248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6424486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6424706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6424939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6425334Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6425727Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6426057Z STAGE:2022-11-23 01:51:57 23758:23758 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6426368Z STAGE:2022-11-23 01:51:57 23757:23757 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6426646Z [1669168317.982245] [08317a7e7676:23758:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6426919Z [1669168319.600311] [08317a7e7676:23758:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6427157Z [1669168319.600311] [08317a7e7676:23758:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6427431Z [1669168317.961655] [08317a7e7676:23757:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6427652Z [1669168319.587278] [08317a7e7676:23757:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6427877Z [1669168319.587278] [08317a7e7676:23757:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6428483Z STAGE:2022-11-23 01:51:59 23758:23758 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:51:59 23757:23757 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6428504Z 2022-11-23T02:12:26.6428853Z STAGE:2022-11-23 01:51:59 23758:23758 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6429417Z STAGE:2022-11-23 01:51:59 23757:23757 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6429746Z STAGE:2022-11-23 01:52:00 23758:23758 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6430052Z STAGE:2022-11-23 01:52:00 23757:23757 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6430378Z STAGE:2022-11-23 01:52:00 23758:23758 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6430698Z STAGE:2022-11-23 01:52:00 23757:23757 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6431040Z STAGE:2022-11-23 01:52:00 23758:23758 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6431382Z STAGE:2022-11-23 01:52:00 23757:23757 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6431478Z ok (6.568s) 2022-11-23T02:12:26.6431496Z 2022-11-23T02:12:26.6431756Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6431862Z Ran 1 test in 6.568s 2022-11-23T02:12:26.6431881Z 2022-11-23T02:12:26.6431964Z OK 2022-11-23T02:12:26.6431984Z 2022-11-23T02:12:26.6432090Z Generating XML reports... 
2022-11-23T02:12:26.6432539Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015154.xml 2022-11-23T02:12:26.6432911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6433086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6433467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6433657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6433676Z 2022-11-23T02:12:26.6433776Z Running tests... 2022-11-23T02:12:26.6434033Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6434330Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6434587Z test_all_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6434805Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23871 2022-11-23T02:12:26.6435014Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23872 2022-11-23T02:12:26.6435382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6435554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6436004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6436202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6436568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6436723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6437099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6437288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6437526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6437828Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6438229Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6438629Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6438858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6439082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6439224Z skip: Skipped due to small world size. (4.260s) 2022-11-23T02:12:26.6439244Z 2022-11-23T02:12:26.6439509Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6439613Z Ran 1 test in 4.261s 2022-11-23T02:12:26.6439632Z 2022-11-23T02:12:26.6439733Z OK (skipped=1) 2022-11-23T02:12:26.6439752Z 2022-11-23T02:12:26.6439866Z Generating XML reports... 2022-11-23T02:12:26.6440313Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015203.xml 2022-11-23T02:12:26.6440684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6440853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6441220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6441407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6441427Z 2022-11-23T02:12:26.6441528Z Running tests... 2022-11-23T02:12:26.6441784Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6442089Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6442354Z test_all_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6442568Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 23974 2022-11-23T02:12:26.6442786Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 23975 2022-11-23T02:12:26.6443159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6443318Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6443694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6443878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6444239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6444411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6444780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6444964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6445251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6445488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6445886Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6446278Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6446505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6446781Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6446932Z skip: Skipped due to small world size. (4.254s) 2022-11-23T02:12:26.6446952Z 2022-11-23T02:12:26.6447214Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6447319Z Ran 1 test in 4.255s 2022-11-23T02:12:26.6447338Z 2022-11-23T02:12:26.6447441Z OK (skipped=1) 2022-11-23T02:12:26.6447460Z 2022-11-23T02:12:26.6447566Z Generating XML reports... 2022-11-23T02:12:26.6448008Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015210.xml 2022-11-23T02:12:26.6448383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6448557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6448931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6449126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6449145Z 2022-11-23T02:12:26.6449251Z Running tests... 2022-11-23T02:12:26.6449508Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6449825Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6450080Z test_all_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6450301Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24077 2022-11-23T02:12:26.6450513Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24078 2022-11-23T02:12:26.6450879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6451051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6451434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6451618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6451986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6452142Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6452516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6452703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6452948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6453187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6453587Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6453980Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6454254Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6454483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6454624Z skip: Skipped due to small world size. (4.270s) 2022-11-23T02:12:26.6454643Z 2022-11-23T02:12:26.6454902Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6455006Z Ran 1 test in 4.270s 2022-11-23T02:12:26.6455026Z 2022-11-23T02:12:26.6455124Z OK (skipped=1) 2022-11-23T02:12:26.6455143Z 2022-11-23T02:12:26.6455263Z Generating XML reports... 2022-11-23T02:12:26.6455708Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015216.xml 2022-11-23T02:12:26.6456128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6456302Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6456687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6456864Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6456883Z 2022-11-23T02:12:26.6456989Z Running tests... 2022-11-23T02:12:26.6457251Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6457562Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6457825Z test_all_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6458046Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24180 2022-11-23T02:12:26.6458262Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24181 2022-11-23T02:12:26.6458642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6458801Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6459180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6459364Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6459729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6459896Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6460273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6460465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6460708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6460960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6461347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6461744Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6461974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6462203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6462355Z skip: Skipped due to small world size. (4.270s) 2022-11-23T02:12:26.6462378Z 2022-11-23T02:12:26.6462638Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6462745Z Ran 1 test in 4.270s 2022-11-23T02:12:26.6462764Z 2022-11-23T02:12:26.6462868Z OK (skipped=1) 2022-11-23T02:12:26.6462887Z 2022-11-23T02:12:26.6463043Z Generating XML reports... 2022-11-23T02:12:26.6463498Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015223.xml 2022-11-23T02:12:26.6463877Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6464051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6464426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6464613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6464633Z 2022-11-23T02:12:26.6464786Z Running tests... 2022-11-23T02:12:26.6465051Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6465363Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6465604Z test_all_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6465824Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24283 2022-11-23T02:12:26.6466040Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24284 2022-11-23T02:12:26.6466407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6466583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6466964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6467148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6467518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6467674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6468055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6468241Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6468488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6468726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6469333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6469745Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6469982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6470329Z STAGE:2022-11-23 01:52:34 24284:24284 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6470544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6470872Z STAGE:2022-11-23 01:52:34 24283:24283 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6471146Z [1669168354.619254] [08317a7e7676:24284:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6471373Z [1669168356.240441] [08317a7e7676:24284:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6471607Z [1669168356.240441] [08317a7e7676:24284:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6471880Z [1669168354.597587] [08317a7e7676:24283:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6472227Z [1669168356.275098] [08317a7e7676:24283:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6472476Z [1669168356.275098] [08317a7e7676:24283:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6473031Z STAGE:2022-11-23 01:52:36 24284:24284 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:36 24283:24283 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6473053Z 2022-11-23T02:12:26.6473406Z STAGE:2022-11-23 01:52:36 24284:24284 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6473757Z STAGE:2022-11-23 01:52:36 24283:24283 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6474149Z STAGE:2022-11-23 01:52:36 24284:24284 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6474472Z STAGE:2022-11-23 01:52:36 24283:24283 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6474809Z STAGE:2022-11-23 01:52:36 24284:24284 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6475142Z STAGE:2022-11-23 01:52:36 24283:24283 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6475530Z STAGE:2022-11-23 01:52:36 24284:24284 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6475880Z STAGE:2022-11-23 01:52:36 24283:24283 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6475980Z ok (6.699s) 2022-11-23T02:12:26.6475999Z 2022-11-23T02:12:26.6476257Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6476368Z Ran 1 test in 6.699s 2022-11-23T02:12:26.6476388Z 2022-11-23T02:12:26.6476461Z OK 2022-11-23T02:12:26.6476480Z 2022-11-23T02:12:26.6476602Z Generating XML reports... 
2022-11-23T02:12:26.6477048Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015230.xml 2022-11-23T02:12:26.6477424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6477597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6477973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6478162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6478181Z 2022-11-23T02:12:26.6478283Z Running tests... 2022-11-23T02:12:26.6478529Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6478848Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6479106Z test_all_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6479326Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24397 2022-11-23T02:12:26.6479537Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24398 2022-11-23T02:12:26.6479914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6480091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6480464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6480653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6481005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6481182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6481602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6481798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6482041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6482289Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6482699Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6483097Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6483370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6483696Z STAGE:2022-11-23 01:52:43 24398:24398 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6483924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6484246Z STAGE:2022-11-23 01:52:43 24397:24397 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6484519Z [1669168363.871317] [08317a7e7676:24398:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6484749Z [1669168365.521848] [08317a7e7676:24398:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6484978Z [1669168365.521848] [08317a7e7676:24398:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6485255Z [1669168363.849958] [08317a7e7676:24397:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6485483Z [1669168365.524643] [08317a7e7676:24397:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6485718Z [1669168365.524643] [08317a7e7676:24397:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6486278Z STAGE:2022-11-23 01:52:45 24398:24398 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:45 24397:24397 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6486300Z 2022-11-23T02:12:26.6486635Z STAGE:2022-11-23 01:52:45 24398:24398 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6486981Z STAGE:2022-11-23 01:52:45 24397:24397 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6487306Z STAGE:2022-11-23 01:52:46 24398:24398 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6487637Z STAGE:2022-11-23 01:52:46 24397:24397 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6487975Z STAGE:2022-11-23 01:52:46 24398:24398 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6488534Z STAGE:2022-11-23 01:52:46 24397:24397 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:52:46 24398:24398 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6488555Z 2022-11-23T02:12:26.6488899Z STAGE:2022-11-23 01:52:46 24397:24397 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6489039Z ok (6.662s) 2022-11-23T02:12:26.6489058Z 2022-11-23T02:12:26.6489316Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6489427Z Ran 1 test in 6.662s 2022-11-23T02:12:26.6489451Z 2022-11-23T02:12:26.6489525Z OK 2022-11-23T02:12:26.6489543Z 2022-11-23T02:12:26.6489665Z Generating XML reports... 
2022-11-23T02:12:26.6490109Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015239.xml 2022-11-23T02:12:26.6490536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6490716Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6491098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6491287Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6491307Z 2022-11-23T02:12:26.6491414Z Running tests... 2022-11-23T02:12:26.6491663Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6491970Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6492284Z test_all_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6492498Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24511 2022-11-23T02:12:26.6492719Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24512 2022-11-23T02:12:26.6493096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6493265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6493648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6493836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6494188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6494360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6494728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6494917Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6495166Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6495411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6495818Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6496209Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6496442Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6496662Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6497004Z STAGE:2022-11-23 01:52:54 24511:24511 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6497780Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:12:26.6497891Z warnings.warn( 2022-11-23T02:12:26.6498220Z STAGE:2022-11-23 01:52:54 24512:24512 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6498990Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:12:26.6499102Z warnings.warn( 2022-11-23T02:12:26.6499375Z [1669168374.810175] [08317a7e7676:24512:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6499657Z [1669168374.824854] [08317a7e7676:24512:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6499901Z [1669168374.824854] [08317a7e7676:24512:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6500233Z STAGE:2022-11-23 01:52:55 24512:24512 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6500500Z [1669168374.804249] [08317a7e7676:24511:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. 
Fall back to non cooperative launch. 2022-11-23T02:12:26.6500730Z [1669168374.819394] [08317a7e7676:24511:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6501015Z [1669168374.819394] [08317a7e7676:24511:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6501361Z STAGE:2022-11-23 01:52:55 24511:24511 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6501706Z STAGE:2022-11-23 01:52:55 24512:24512 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6502055Z STAGE:2022-11-23 01:52:55 24511:24511 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6502379Z STAGE:2022-11-23 01:52:55 24511:24511 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6502713Z STAGE:2022-11-23 01:52:55 24511:24511 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6503042Z STAGE:2022-11-23 01:52:55 24511:24511 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6503377Z STAGE:2022-11-23 01:52:55 24512:24512 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6503749Z STAGE:2022-11-23 01:52:55 24512:24512 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6504096Z STAGE:2022-11-23 01:52:55 24512:24512 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6504198Z ok (6.858s) 2022-11-23T02:12:26.6504218Z 2022-11-23T02:12:26.6504482Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6504587Z Ran 1 test in 6.859s 2022-11-23T02:12:26.6504607Z 2022-11-23T02:12:26.6504695Z OK 2022-11-23T02:12:26.6504714Z 2022-11-23T02:12:26.6504836Z Generating XML reports... 
2022-11-23T02:12:26.6505273Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015249.xml 2022-11-23T02:12:26.6505642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6505820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6506197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6506387Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6506407Z 2022-11-23T02:12:26.6506515Z Running tests... 2022-11-23T02:12:26.6506778Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6507092Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6507352Z test_all_reduce_multigpu_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6507571Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24629 2022-11-23T02:12:26.6507789Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24630 2022-11-23T02:12:26.6508171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6508345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6508768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6509263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6509656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6509824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6510185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6510372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6510616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6510945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6511354Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6511758Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6511990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6512222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6512557Z STAGE:2022-11-23 01:53:03 24629:24629 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6513320Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:12:26.6513433Z warnings.warn( 2022-11-23T02:12:26.6513763Z STAGE:2022-11-23 01:53:04 24630:24630 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6514533Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1506: UserWarning: torch.distributed.all_reduce_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:12:26.6514641Z warnings.warn( 2022-11-23T02:12:26.6514919Z [1669168384.103129] [08317a7e7676:24630:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6515146Z [1669168384.117511] [08317a7e7676:24630:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6515388Z [1669168384.117511] [08317a7e7676:24630:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6515660Z [1669168384.094360] [08317a7e7676:24629:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6515888Z [1669168384.109149] [08317a7e7676:24629:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6516119Z [1669168384.109149] [08317a7e7676:24629:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6516663Z STAGE:2022-11-23 01:53:04 24630:24630 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:53:04 24629:24629 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6516696Z 2022-11-23T02:12:26.6517037Z STAGE:2022-11-23 01:53:04 24630:24630 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6517388Z STAGE:2022-11-23 01:53:04 24629:24629 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6517776Z STAGE:2022-11-23 01:53:04 24630:24630 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6518110Z STAGE:2022-11-23 01:53:04 24629:24629 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6518435Z STAGE:2022-11-23 01:53:04 24630:24630 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6518783Z STAGE:2022-11-23 01:53:04 24630:24630 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6519114Z STAGE:2022-11-23 01:53:04 24629:24629 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6519455Z STAGE:2022-11-23 01:53:04 24629:24629 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6519587Z ok (6.761s) 2022-11-23T02:12:26.6519622Z 2022-11-23T02:12:26.6519872Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6519983Z Ran 1 test in 6.761s 2022-11-23T02:12:26.6520002Z 2022-11-23T02:12:26.6520090Z OK 2022-11-23T02:12:26.6520109Z 2022-11-23T02:12:26.6520235Z Generating XML reports... 
2022-11-23T02:12:26.6520690Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015258.xml 2022-11-23T02:12:26.6521068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6521239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6521624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6521803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6521826Z 2022-11-23T02:12:26.6521932Z Running tests... 2022-11-23T02:12:26.6522191Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6522504Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6522769Z test_all_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6522988Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24747 2022-11-23T02:12:26.6523205Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24748 2022-11-23T02:12:26.6523579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6523740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6524127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6524320Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6524689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6524862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6525245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6525434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6525684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6525933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6526321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6526721Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6526956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6527346Z STAGE:2022-11-23 01:53:11 24747:24747 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6527576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6527908Z STAGE:2022-11-23 01:53:11 24748:24748 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6528183Z [1669168391.731831] [08317a7e7676:24747:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6528412Z [1669168393.383848] [08317a7e7676:24747:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6528642Z [1669168393.383848] [08317a7e7676:24747:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6528945Z [1669168391.734487] [08317a7e7676:24748:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6529180Z [1669168393.393833] [08317a7e7676:24748:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6529415Z [1669168393.393833] [08317a7e7676:24748:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6529977Z STAGE:2022-11-23 01:53:13 24747:24747 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:53:13 24748:24748 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6529997Z 2022-11-23T02:12:26.6530351Z STAGE:2022-11-23 01:53:13 24747:24747 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6530703Z STAGE:2022-11-23 01:53:13 24748:24748 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6531024Z STAGE:2022-11-23 01:53:13 24747:24747 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6531353Z STAGE:2022-11-23 01:53:13 24748:24748 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6531689Z STAGE:2022-11-23 01:53:13 24747:24747 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6532244Z STAGE:2022-11-23 01:53:13 24748:24748 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:53:13 24747:24747 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6532263Z 2022-11-23T02:12:26.6532609Z STAGE:2022-11-23 01:53:13 24748:24748 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6532694Z ok (6.624s) 2022-11-23T02:12:26.6532725Z 2022-11-23T02:12:26.6532980Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6533089Z Ran 1 test in 6.624s 2022-11-23T02:12:26.6533108Z 2022-11-23T02:12:26.6533196Z OK 2022-11-23T02:12:26.6533215Z 2022-11-23T02:12:26.6533335Z Generating XML reports... 
2022-11-23T02:12:26.6533793Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015307.xml 2022-11-23T02:12:26.6534169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6534341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6534727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6534905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6534923Z 2022-11-23T02:12:26.6535029Z Running tests... 2022-11-23T02:12:26.6535294Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6535611Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6535881Z test_all_reduce_result_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6536148Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24861 2022-11-23T02:12:26.6536378Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24862 2022-11-23T02:12:26.6536758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6536917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6537294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6537482Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6537914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6538086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6538470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6538657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6538903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6539152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6539539Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6539935Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6540166Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6540397Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6540675Z [1669168402.254542] [08317a7e7676:24861:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6540899Z [1669168402.267471] [08317a7e7676:24861:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6541136Z [1669168402.267471] [08317a7e7676:24861:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6541407Z [1669168402.256129] [08317a7e7676:24862:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6541625Z [1669168402.269552] [08317a7e7676:24862:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6541861Z [1669168402.269552] [08317a7e7676:24862:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6541946Z ok (6.119s) 2022-11-23T02:12:26.6541968Z 2022-11-23T02:12:26.6542237Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6542342Z Ran 1 test in 6.119s 2022-11-23T02:12:26.6542361Z 2022-11-23T02:12:26.6542447Z OK 2022-11-23T02:12:26.6542467Z 2022-11-23T02:12:26.6542588Z Generating XML reports... 
2022-11-23T02:12:26.6543036Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015317.xml 2022-11-23T02:12:26.6543410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6543586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6543956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6544147Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6544166Z 2022-11-23T02:12:26.6544321Z Running tests... 2022-11-23T02:12:26.6544592Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6544899Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6545156Z test_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6545380Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 24975 2022-11-23T02:12:26.6545593Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 24976 2022-11-23T02:12:26.6545969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6546178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6546566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6546760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6547121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6547293Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6547667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6547852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6548095Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6548326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6548732Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6549356Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6549598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6549945Z STAGE:2022-11-23 01:53:29 24976:24976 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6550176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6550508Z STAGE:2022-11-23 01:53:29 24975:24975 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6550780Z [1669168409.630164] [08317a7e7676:24976:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6551008Z [1669168411.261005] [08317a7e7676:24976:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6551246Z [1669168411.261005] [08317a7e7676:24976:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6551501Z [1669168409.608008] [08317a7e7676:24975:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6551728Z [1669168411.276424] [08317a7e7676:24975:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6551965Z [1669168411.276424] [08317a7e7676:24975:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6552522Z STAGE:2022-11-23 01:53:31 24976:24976 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:53:31 24975:24975 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6552546Z 2022-11-23T02:12:26.6552897Z STAGE:2022-11-23 01:53:31 24976:24976 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6553316Z STAGE:2022-11-23 01:53:31 24975:24975 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6553661Z STAGE:2022-11-23 01:53:31 24976:24976 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6553986Z STAGE:2022-11-23 01:53:31 24975:24975 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6554316Z STAGE:2022-11-23 01:53:31 24976:24976 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6554663Z STAGE:2022-11-23 01:53:31 24976:24976 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6554981Z STAGE:2022-11-23 01:53:31 24975:24975 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6555391Z STAGE:2022-11-23 01:53:31 24975:24975 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6555491Z ok (6.656s) 2022-11-23T02:12:26.6555510Z 2022-11-23T02:12:26.6555772Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6555883Z Ran 1 test in 6.656s 2022-11-23T02:12:26.6555903Z 2022-11-23T02:12:26.6555987Z OK 2022-11-23T02:12:26.6556006Z 2022-11-23T02:12:26.6556130Z Generating XML reports... 
2022-11-23T02:12:26.6556582Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015325.xml 2022-11-23T02:12:26.6556956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6557115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6557497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6557691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6557710Z 2022-11-23T02:12:26.6557813Z Running tests... 2022-11-23T02:12:26.6558075Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6558391Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6558650Z test_all_reduce_sum_async (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6558870Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25089 2022-11-23T02:12:26.6559073Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25090 2022-11-23T02:12:26.6559446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6559615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6560006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6560197Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6560561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6560731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6561106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6561294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6561527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6561772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6562175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6562576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6562850Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6563196Z STAGE:2022-11-23 01:53:38 25089:25089 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6563427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6563762Z STAGE:2022-11-23 01:53:38 25090:25090 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6564032Z [1669168418.779702] [08317a7e7676:25089:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6564247Z [1669168420.419307] [08317a7e7676:25089:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6564534Z [1669168420.419307] [08317a7e7676:25089:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6564807Z [1669168418.800258] [08317a7e7676:25090:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6565028Z [1669168420.427579] [08317a7e7676:25090:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6565260Z [1669168420.427579] [08317a7e7676:25090:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6565816Z STAGE:2022-11-23 01:53:40 25089:25089 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:53:40 25090:25090 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6565839Z 2022-11-23T02:12:26.6566192Z STAGE:2022-11-23 01:53:40 25090:25090 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6566537Z STAGE:2022-11-23 01:53:40 25089:25089 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6566865Z STAGE:2022-11-23 01:53:40 25090:25090 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6567189Z STAGE:2022-11-23 01:53:40 25089:25089 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6567506Z STAGE:2022-11-23 01:53:40 25090:25090 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6567833Z STAGE:2022-11-23 01:53:40 25089:25089 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6568175Z STAGE:2022-11-23 01:53:40 25090:25090 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6568521Z STAGE:2022-11-23 01:53:40 25089:25089 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6568626Z ok (6.559s) 2022-11-23T02:12:26.6568645Z 2022-11-23T02:12:26.6568904Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6569014Z Ran 1 test in 6.559s 2022-11-23T02:12:26.6569034Z 2022-11-23T02:12:26.6569122Z OK 2022-11-23T02:12:26.6569141Z 2022-11-23T02:12:26.6569260Z Generating XML reports... 
2022-11-23T02:12:26.6569699Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015334.xml 2022-11-23T02:12:26.6570076Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6570251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6570631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6570822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6570845Z 2022-11-23T02:12:26.6570947Z Running tests... 2022-11-23T02:12:26.6571210Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6571568Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6571824Z test_all_reduce_sum_complex (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6572046Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 25203 2022-11-23T02:12:26.6572262Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 25204 2022-11-23T02:12:26.6572634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6572805Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6573188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6573425Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6573791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6573969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6574332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6574520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6574772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6575018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6575419Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6575820Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6576048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6576388Z STAGE:2022-11-23 01:53:47 25203:25203 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6576621Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6576938Z STAGE:2022-11-23 01:53:47 25204:25204 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6577210Z [1669168427.913139] [08317a7e7676:25204:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6577435Z [1669168429.526214] [08317a7e7676:25204:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6577672Z [1669168429.526214] [08317a7e7676:25204:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6577940Z [1669168427.890987] [08317a7e7676:25203:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6578168Z [1669168429.524429] [08317a7e7676:25203:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6578397Z [1669168429.524429] [08317a7e7676:25203:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6578957Z STAGE:2022-11-23 01:53:49 25204:25204 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:53:49 25203:25203 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6578977Z 2022-11-23T02:12:26.6579326Z STAGE:2022-11-23 01:53:49 25204:25204 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6579674Z STAGE:2022-11-23 01:53:49 25203:25203 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6579989Z STAGE:2022-11-23 01:53:50 25204:25204 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6580360Z STAGE:2022-11-23 01:53:50 25203:25203 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6580704Z STAGE:2022-11-23 01:53:50 25204:25204 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6581030Z STAGE:2022-11-23 01:53:50 25203:25203 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6581378Z STAGE:2022-11-23 01:53:50 25204:25204 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6581718Z STAGE:2022-11-23 01:53:50 25203:25203 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6581819Z ok (6.554s) 2022-11-23T02:12:26.6581884Z 2022-11-23T02:12:26.6582161Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6582273Z Ran 1 test in 6.554s 2022-11-23T02:12:26.6582292Z 2022-11-23T02:12:26.6582366Z OK 2022-11-23T02:12:26.6582385Z 2022-11-23T02:12:26.6582509Z Generating XML reports... 
2022-11-23T02:12:26.6582966Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015344.xml 2022-11-23T02:12:26.6583341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6583519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6583903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6584095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6584114Z 2022-11-23T02:12:26.6584223Z Running tests... 2022-11-23T02:12:26.6584483Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6584780Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6585087Z test_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T02:12:26.6585107Z 2022-11-23T02:12:26.6585370Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6585479Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6585499Z 2022-11-23T02:12:26.6585607Z OK (skipped=1) 2022-11-23T02:12:26.6585625Z 2022-11-23T02:12:26.6585748Z Generating XML reports... 
2022-11-23T02:12:26.6586195Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015353.xml 2022-11-23T02:12:26.6586572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6586750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6587115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6587311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6587330Z 2022-11-23T02:12:26.6587435Z Running tests... 2022-11-23T02:12:26.6587694Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6588004Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6588314Z test_all_reduce_sum_cuda_async (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T02:12:26.6588334Z 2022-11-23T02:12:26.6588591Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6588702Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6588724Z 2022-11-23T02:12:26.6588813Z OK (skipped=1) 2022-11-23T02:12:26.6588850Z 2022-11-23T02:12:26.6589167Z Generating XML reports... 
2022-11-23T02:12:26.6589661Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015355.xml 2022-11-23T02:12:26.6590115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6590298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6590681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6590871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6590891Z 2022-11-23T02:12:26.6590997Z Running tests... 2022-11-23T02:12:26.6591259Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6591555Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6591929Z test_all_reduce_sum_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and NCCL backends will have CUDA allReduce tested (0.002s) 2022-11-23T02:12:26.6591949Z 2022-11-23T02:12:26.6592215Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6592322Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6592341Z 2022-11-23T02:12:26.6592446Z OK (skipped=1) 2022-11-23T02:12:26.6592465Z 2022-11-23T02:12:26.6592587Z Generating XML reports... 
2022-11-23T02:12:26.6593028Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015357.xml 2022-11-23T02:12:26.6593401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6593576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6593940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6594135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6594154Z 2022-11-23T02:12:26.6594260Z Running tests... 2022-11-23T02:12:26.6594527Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6594843Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6595092Z test_all_to_all (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:12:26.6595111Z 2022-11-23T02:12:26.6595369Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6595477Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6595496Z 2022-11-23T02:12:26.6595602Z OK (skipped=1) 2022-11-23T02:12:26.6595621Z 2022-11-23T02:12:26.6595725Z Generating XML reports... 
2022-11-23T02:12:26.6596166Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015400.xml 2022-11-23T02:12:26.6596546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6596724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6597103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6597291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6597311Z 2022-11-23T02:12:26.6597416Z Running tests... 2022-11-23T02:12:26.6597677Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6597973Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6598234Z test_all_to_all_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:12:26.6598257Z 2022-11-23T02:12:26.6598516Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6598626Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6598645Z 2022-11-23T02:12:26.6598752Z OK (skipped=1) 2022-11-23T02:12:26.6598771Z 2022-11-23T02:12:26.6598938Z Generating XML reports... 
2022-11-23T02:12:26.6599389Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015402.xml 2022-11-23T02:12:26.6599761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6599937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6600303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6600494Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6600513Z 2022-11-23T02:12:26.6600679Z Running tests... 2022-11-23T02:12:26.6600944Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6601257Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6601522Z test_all_to_all_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T02:12:26.6601541Z 2022-11-23T02:12:26.6601799Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6601908Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6601927Z 2022-11-23T02:12:26.6602034Z OK (skipped=1) 2022-11-23T02:12:26.6602053Z 2022-11-23T02:12:26.6602158Z Generating XML reports... 
2022-11-23T02:12:26.6602603Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015405.xml 2022-11-23T02:12:26.6602978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6603157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6603541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6603734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6603753Z 2022-11-23T02:12:26.6603860Z Running tests... 2022-11-23T02:12:26.6604122Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6604418Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6604690Z test_all_to_all_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T02:12:26.6604709Z 2022-11-23T02:12:26.6604968Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6605073Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6605096Z 2022-11-23T02:12:26.6605205Z OK (skipped=1) 2022-11-23T02:12:26.6605224Z 2022-11-23T02:12:26.6605345Z Generating XML reports... 
2022-11-23T02:12:26.6605788Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015407.xml 2022-11-23T02:12:26.6606164Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6606345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6606711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6606907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6606926Z 2022-11-23T02:12:26.6607032Z Running tests... 2022-11-23T02:12:26.6607293Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6607605Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6607869Z test_all_to_all_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:12:26.6607888Z 2022-11-23T02:12:26.6608147Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6608304Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6608325Z 2022-11-23T02:12:26.6608432Z OK (skipped=1) 2022-11-23T02:12:26.6608451Z 2022-11-23T02:12:26.6608556Z Generating XML reports... 
2022-11-23T02:12:26.6609001Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015410.xml 2022-11-23T02:12:26.6609374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6609548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6609930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6610222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6610241Z 2022-11-23T02:12:26.6610349Z Running tests... 2022-11-23T02:12:26.6610611Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6610927Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6611186Z test_all_to_all_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL supports CUDA all_to_all (0.002s) 2022-11-23T02:12:26.6611205Z 2022-11-23T02:12:26.6611470Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6611578Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6611597Z 2022-11-23T02:12:26.6611702Z OK (skipped=1) 2022-11-23T02:12:26.6611721Z 2022-11-23T02:12:26.6611841Z Generating XML reports... 
2022-11-23T02:12:26.6612289Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015412.xml 2022-11-23T02:12:26.6612667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6612843Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6613209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6613402Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6613421Z 2022-11-23T02:12:26.6613528Z Running tests... 2022-11-23T02:12:26.6613788Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6614101Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6614355Z test_all_to_all_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports all_to_all (0.002s) 2022-11-23T02:12:26.6614375Z 2022-11-23T02:12:26.6614639Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6614750Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6614769Z 2022-11-23T02:12:26.6614875Z OK (skipped=1) 2022-11-23T02:12:26.6614893Z 2022-11-23T02:12:26.6614998Z Generating XML reports... 
2022-11-23T02:12:26.6615447Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015414.xml 2022-11-23T02:12:26.6615821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6615997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6616375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6616566Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6616585Z 2022-11-23T02:12:26.6616692Z Running tests... 2022-11-23T02:12:26.6616957Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6617269Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6617573Z test_all_to_all_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6617594Z 2022-11-23T02:12:26.6617858Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6617970Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6617989Z 2022-11-23T02:12:26.6618094Z OK (skipped=1) 2022-11-23T02:12:26.6618113Z 2022-11-23T02:12:26.6618233Z Generating XML reports... 
2022-11-23T02:12:26.6618678Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015417.xml 2022-11-23T02:12:26.6619049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6619276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6619660Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6619835Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6619858Z 2022-11-23T02:12:26.6619965Z Running tests... 2022-11-23T02:12:26.6620227Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6620539Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6620827Z test_all_to_all_single_equal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6620847Z 2022-11-23T02:12:26.6621107Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6621216Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6621234Z 2022-11-23T02:12:26.6621342Z OK (skipped=1) 2022-11-23T02:12:26.6621364Z 2022-11-23T02:12:26.6621467Z Generating XML reports... 
2022-11-23T02:12:26.6621910Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015419.xml 2022-11-23T02:12:26.6622285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6622460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6622844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6623037Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6623056Z 2022-11-23T02:12:26.6623163Z Running tests... 2022-11-23T02:12:26.6623417Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6623728Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6624012Z test_all_to_all_single_equal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6624049Z 2022-11-23T02:12:26.6624293Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6624407Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6624426Z 2022-11-23T02:12:26.6624530Z OK (skipped=1) 2022-11-23T02:12:26.6624548Z 2022-11-23T02:12:26.6624669Z Generating XML reports... 
2022-11-23T02:12:26.6625114Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015422.xml 2022-11-23T02:12:26.6625488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6625663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6626044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6626220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6626256Z 2022-11-23T02:12:26.6626344Z Running tests... 2022-11-23T02:12:26.6626606Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6626962Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6627262Z test_all_to_all_single_equal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6627281Z 2022-11-23T02:12:26.6627541Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6627651Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6627670Z 2022-11-23T02:12:26.6627776Z OK (skipped=1) 2022-11-23T02:12:26.6627795Z 2022-11-23T02:12:26.6627919Z Generating XML reports... 
2022-11-23T02:12:26.6628348Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015424.xml 2022-11-23T02:12:26.6628780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6629174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6629574Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6629768Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6629789Z 2022-11-23T02:12:26.6629896Z Running tests... 2022-11-23T02:12:26.6630158Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6630472Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6630761Z test_all_to_all_single_equal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6630802Z 2022-11-23T02:12:26.6631043Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6631155Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6631174Z 2022-11-23T02:12:26.6631279Z OK (skipped=1) 2022-11-23T02:12:26.6631297Z 2022-11-23T02:12:26.6631422Z Generating XML reports... 
2022-11-23T02:12:26.6631863Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015426.xml 2022-11-23T02:12:26.6632234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6632408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6632787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6632961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6632998Z 2022-11-23T02:12:26.6633091Z Running tests... 2022-11-23T02:12:26.6633349Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6633660Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6633962Z test_all_to_all_single_equal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6633981Z 2022-11-23T02:12:26.6634235Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6634347Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6634367Z 2022-11-23T02:12:26.6634473Z OK (skipped=1) 2022-11-23T02:12:26.6634492Z 2022-11-23T02:12:26.6634612Z Generating XML reports... 
2022-11-23T02:12:26.6635042Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015429.xml 2022-11-23T02:12:26.6635418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6635595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6635983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6636247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6636269Z 2022-11-23T02:12:26.6636382Z Running tests... 2022-11-23T02:12:26.6636648Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6636961Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6637272Z test_all_to_all_single_equal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6637292Z 2022-11-23T02:12:26.6637533Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6637642Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6637720Z 2022-11-23T02:12:26.6637830Z OK (skipped=1) 2022-11-23T02:12:26.6637849Z 2022-11-23T02:12:26.6637970Z Generating XML reports... 
2022-11-23T02:12:26.6638417Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015431.xml 2022-11-23T02:12:26.6638795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6638976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6639359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6639550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6639569Z 2022-11-23T02:12:26.6639658Z Running tests... 2022-11-23T02:12:26.6639919Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6640235Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6640533Z test_all_to_all_single_equal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6640552Z 2022-11-23T02:12:26.6640809Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6640920Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6640939Z 2022-11-23T02:12:26.6641046Z OK (skipped=1) 2022-11-23T02:12:26.6641065Z 2022-11-23T02:12:26.6641185Z Generating XML reports... 
2022-11-23T02:12:26.6641630Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015434.xml 2022-11-23T02:12:26.6641989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6642164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6642545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6642740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6642760Z 2022-11-23T02:12:26.6642866Z Running tests... 2022-11-23T02:12:26.6643134Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6643449Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6643751Z test_all_to_all_single_equal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6643771Z 2022-11-23T02:12:26.6644030Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6644122Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6644141Z 2022-11-23T02:12:26.6644243Z OK (skipped=1) 2022-11-23T02:12:26.6644262Z 2022-11-23T02:12:26.6644384Z Generating XML reports... 
2022-11-23T02:12:26.6644834Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015436.xml 2022-11-23T02:12:26.6645207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6645436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6645828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6646020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6646040Z 2022-11-23T02:12:26.6646128Z Running tests... 2022-11-23T02:12:26.6646390Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6646704Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6646994Z test_all_to_all_single_unequal_split (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6647059Z 2022-11-23T02:12:26.6647325Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6647432Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6647451Z 2022-11-23T02:12:26.6647558Z OK (skipped=1) 2022-11-23T02:12:26.6647580Z 2022-11-23T02:12:26.6647705Z Generating XML reports... 
2022-11-23T02:12:26.6648147Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015438.xml 2022-11-23T02:12:26.6648504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6648682Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6649066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6649258Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6649280Z 2022-11-23T02:12:26.6649388Z Running tests... 2022-11-23T02:12:26.6649652Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6649967Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6650272Z test_all_to_all_single_unequal_split_complex (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6650291Z 2022-11-23T02:12:26.6650555Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6650648Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6650667Z 2022-11-23T02:12:26.6650773Z OK (skipped=1) 2022-11-23T02:12:26.6650793Z 2022-11-23T02:12:26.6650916Z Generating XML reports... 
2022-11-23T02:12:26.6651363Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015441.xml 2022-11-23T02:12:26.6651742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6651919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6652306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6652503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6652522Z 2022-11-23T02:12:26.6652629Z Running tests... 2022-11-23T02:12:26.6652875Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6653187Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6653489Z test_all_to_all_single_unequal_split_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6653508Z 2022-11-23T02:12:26.6653764Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6653880Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6653899Z 2022-11-23T02:12:26.6654007Z OK (skipped=1) 2022-11-23T02:12:26.6654025Z 2022-11-23T02:12:26.6654149Z Generating XML reports... 
2022-11-23T02:12:26.6654639Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015443.xml 2022-11-23T02:12:26.6655006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6655182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6655562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6655752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6655770Z 2022-11-23T02:12:26.6655877Z Running tests... 2022-11-23T02:12:26.6656141Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6656505Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6656821Z test_all_to_all_single_unequal_split_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6656841Z 2022-11-23T02:12:26.6657103Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6657196Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6657235Z 2022-11-23T02:12:26.6657323Z OK (skipped=1) 2022-11-23T02:12:26.6657342Z 2022-11-23T02:12:26.6657466Z Generating XML reports... 
2022-11-23T02:12:26.6657911Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015446.xml 2022-11-23T02:12:26.6658286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6658461Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6658844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6659040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6659058Z 2022-11-23T02:12:26.6659169Z Running tests... 2022-11-23T02:12:26.6659415Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6659730Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6660034Z test_all_to_all_single_unequal_split_full_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6660054Z 2022-11-23T02:12:26.6660309Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6660419Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6660438Z 2022-11-23T02:12:26.6660546Z OK (skipped=1) 2022-11-23T02:12:26.6660568Z 2022-11-23T02:12:26.6660692Z Generating XML reports... 
2022-11-23T02:12:26.6661141Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015448.xml 2022-11-23T02:12:26.6661520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6661680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6662065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6662257Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6662276Z 2022-11-23T02:12:26.6662383Z Running tests... 2022-11-23T02:12:26.6662645Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6662960Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6663277Z test_all_to_all_single_unequal_split_full_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6663296Z 2022-11-23T02:12:26.6663559Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6663720Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6663741Z 2022-11-23T02:12:26.6663832Z OK (skipped=1) 2022-11-23T02:12:26.6663851Z 2022-11-23T02:12:26.6663979Z Generating XML reports... 
2022-11-23T02:12:26.6664424Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015450.xml 2022-11-23T02:12:26.6664795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6664971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6665353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6665607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6665627Z 2022-11-23T02:12:26.6665737Z Running tests... 2022-11-23T02:12:26.6665982Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6666299Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6666601Z test_all_to_all_single_unequal_split_group (__main__.TestDistBackendWithSpawn) ... skip: Only MPI supports CPU all_to_all_single (0.002s) 2022-11-23T02:12:26.6666621Z 2022-11-23T02:12:26.6666881Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6666989Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6667009Z 2022-11-23T02:12:26.6667113Z OK (skipped=1) 2022-11-23T02:12:26.6667131Z 2022-11-23T02:12:26.6667256Z Generating XML reports... 
2022-11-23T02:12:26.6667703Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015453.xml 2022-11-23T02:12:26.6668082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6668241Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6668628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6668823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6668842Z 2022-11-23T02:12:26.6669156Z Running tests... 2022-11-23T02:12:26.6669431Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6669746Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6670056Z test_all_to_all_single_unequal_split_group_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA all_to_all_single (0.002s) 2022-11-23T02:12:26.6670080Z 2022-11-23T02:12:26.6670341Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6670450Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6670469Z 2022-11-23T02:12:26.6670556Z OK (skipped=1) 2022-11-23T02:12:26.6670593Z 2022-11-23T02:12:26.6670702Z Generating XML reports... 
2022-11-23T02:12:26.6671148Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015455.xml 2022-11-23T02:12:26.6671521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6671697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6672085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6672279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6672298Z 2022-11-23T02:12:26.6672410Z Running tests... 2022-11-23T02:12:26.6672672Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6672966Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6673304Z test_average_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6673535Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26208 2022-11-23T02:12:26.6673756Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26209 2022-11-23T02:12:26.6674135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6674310Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6674695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6674950Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6675302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6675477Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6675867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6676058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6676310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6676559Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6676969Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6677377Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6677617Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6677834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6678180Z [1669168504.241571] [08317a7e7676:26208:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6678451Z [1669168504.255242] [08317a7e7676:26208:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6678727Z [1669168504.255242] [08317a7e7676:26208:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6679034Z [1669168504.245396] [08317a7e7676:26209:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6679304Z [1669168504.258095] [08317a7e7676:26209:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6679556Z [1669168504.258095] [08317a7e7676:26209:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6679987Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6680274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6680768Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6681157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:12:26.6681298Z ok (7.252s) 2022-11-23T02:12:26.6681318Z 2022-11-23T02:12:26.6681623Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6681776Z Ran 1 test in 7.252s 2022-11-23T02:12:26.6681796Z 2022-11-23T02:12:26.6681921Z OK 2022-11-23T02:12:26.6681940Z 2022-11-23T02:12:26.6682107Z Generating XML reports... 2022-11-23T02:12:26.6682700Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015458.xml 2022-11-23T02:12:26.6683134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6683350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6683723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6683952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6683973Z 2022-11-23T02:12:26.6684115Z Running tests... 2022-11-23T02:12:26.6684426Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6684833Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6685134Z test_backend_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6685432Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26332 2022-11-23T02:12:26.6685739Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26333 2022-11-23T02:12:26.6686100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6686314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6686734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6686973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6687383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6687594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6688011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6688275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6688561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6688796Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6689236Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6689734Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6690013Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6690282Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6690468Z skip: Need at least 3 CUDA devices (4.152s) 2022-11-23T02:12:26.6690492Z 2022-11-23T02:12:26.6690796Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6691035Z Ran 1 test in 4.153s 2022-11-23T02:12:26.6691055Z 2022-11-23T02:12:26.6691199Z OK (skipped=1) 2022-11-23T02:12:26.6691218Z 2022-11-23T02:12:26.6691325Z Generating XML reports... 2022-11-23T02:12:26.6691828Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015507.xml 2022-11-23T02:12:26.6692239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6692474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6692900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6693128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6693148Z 2022-11-23T02:12:26.6693345Z Running tests... 2022-11-23T02:12:26.6693707Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6694010Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6694305Z test_backend_group (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 3 (0.002s) 2022-11-23T02:12:26.6694326Z 2022-11-23T02:12:26.6694623Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6694768Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6694788Z 2022-11-23T02:12:26.6694928Z OK (skipped=1) 2022-11-23T02:12:26.6694947Z 2022-11-23T02:12:26.6695104Z Generating XML reports... 2022-11-23T02:12:26.6695653Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015514.xml 2022-11-23T02:12:26.6696124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6696378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6696749Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6696979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6696998Z 2022-11-23T02:12:26.6697142Z Running tests... 2022-11-23T02:12:26.6697441Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6697792Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6698075Z test_barrier (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T02:12:26.6698098Z 2022-11-23T02:12:26.6698405Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6698550Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6698569Z 2022-11-23T02:12:26.6698744Z OK (skipped=1) 2022-11-23T02:12:26.6698767Z 2022-11-23T02:12:26.6698874Z Generating XML reports... 
2022-11-23T02:12:26.6699363Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015516.xml 2022-11-23T02:12:26.6699776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6699987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6700407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6700647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6700670Z 2022-11-23T02:12:26.6700862Z Running tests... 2022-11-23T02:12:26.6701160Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6701455Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6701782Z test_barrier_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6702040Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26501 2022-11-23T02:12:26.6702297Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26502 2022-11-23T02:12:26.6702711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6702932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6703352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6703584Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6703990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6704203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6704665Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6704891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6705185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6705471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6705914Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6706456Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6706726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6706996Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6707259Z [1669168524.679374] [08317a7e7676:26501:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6707564Z [1669168524.692975] [08317a7e7676:26501:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6707848Z [1669168524.692975] [08317a7e7676:26501:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6708158Z [1669168524.686695] [08317a7e7676:26502:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6708429Z [1669168524.700133] [08317a7e7676:26502:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6708700Z [1669168524.700133] [08317a7e7676:26502:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6708838Z ok (6.980s) 2022-11-23T02:12:26.6708858Z 2022-11-23T02:12:26.6709418Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6709569Z Ran 1 test in 6.980s 2022-11-23T02:12:26.6709589Z 2022-11-23T02:12:26.6709664Z OK 2022-11-23T02:12:26.6709786Z 2022-11-23T02:12:26.6709897Z Generating XML reports... 
2022-11-23T02:12:26.6710393Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015519.xml 2022-11-23T02:12:26.6710810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6711029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6711497Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6711732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6711751Z 2022-11-23T02:12:26.6711894Z Running tests... 2022-11-23T02:12:26.6712202Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6712503Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6712838Z test_barrier_full_group (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support CPU barrier (0.002s) 2022-11-23T02:12:26.6712859Z 2022-11-23T02:12:26.6713152Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6713298Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6713322Z 2022-11-23T02:12:26.6713463Z OK (skipped=1) 2022-11-23T02:12:26.6713482Z 2022-11-23T02:12:26.6713640Z Generating XML reports... 
2022-11-23T02:12:26.6714127Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015528.xml 2022-11-23T02:12:26.6714632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6714856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6715230Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6715495Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6715514Z 2022-11-23T02:12:26.6715656Z Running tests... 2022-11-23T02:12:26.6715958Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6716361Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6716751Z test_barrier_full_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6717013Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26648 2022-11-23T02:12:26.6717273Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26649 2022-11-23T02:12:26.6717638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6717852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6718305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6718535Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6718945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6732624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6733100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6733315Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6733572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6733823Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6734236Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6734639Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6734870Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6735107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6735265Z skip: Skipped due to small world size. (4.251s) 2022-11-23T02:12:26.6735286Z 2022-11-23T02:12:26.6735543Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6735657Z Ran 1 test in 4.252s 2022-11-23T02:12:26.6735676Z 2022-11-23T02:12:26.6735780Z OK (skipped=1) 2022-11-23T02:12:26.6735799Z 2022-11-23T02:12:26.6735920Z Generating XML reports... 2022-11-23T02:12:26.6736376Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015531.xml 2022-11-23T02:12:26.6736754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6736929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6737312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6737504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6737524Z 2022-11-23T02:12:26.6737615Z Running tests... 2022-11-23T02:12:26.6737972Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6738298Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6738559Z test_barrier_group (__main__.TestDistBackendWithSpawn) ... 
skip: ucc does not support CPU barrier (0.002s) 2022-11-23T02:12:26.6738579Z 2022-11-23T02:12:26.6738841Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6738954Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6738973Z 2022-11-23T02:12:26.6739080Z OK (skipped=1) 2022-11-23T02:12:26.6739098Z 2022-11-23T02:12:26.6739224Z Generating XML reports... 2022-11-23T02:12:26.6739657Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015537.xml 2022-11-23T02:12:26.6740091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6740268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6740655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6740847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6740867Z 2022-11-23T02:12:26.6740973Z Running tests... 2022-11-23T02:12:26.6741234Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6741548Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6741813Z test_barrier_group_cuda (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6742018Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 26784 2022-11-23T02:12:26.6742244Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 26785 2022-11-23T02:12:26.6742624Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6742798Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6743178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6743368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6743732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6743906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6744267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6744460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6744710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6744960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6745367Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6745765Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6745994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6746222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6746380Z skip: Skipped due to small world size. (4.233s) 2022-11-23T02:12:26.6746403Z 2022-11-23T02:12:26.6746653Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6746762Z Ran 1 test in 4.233s 2022-11-23T02:12:26.6746782Z 2022-11-23T02:12:26.6746886Z OK (skipped=1) 2022-11-23T02:12:26.6746905Z 2022-11-23T02:12:26.6747077Z Generating XML reports... 2022-11-23T02:12:26.6747537Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015540.xml 2022-11-23T02:12:26.6747913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6748090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6748475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6748670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6748689Z 2022-11-23T02:12:26.6748831Z Running tests... 2022-11-23T02:12:26.6749420Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6749754Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6750045Z test_barrier_timeout_full_group (__main__.TestDistBackendWithSpawn) ... 
skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T02:12:26.6750065Z 2022-11-23T02:12:26.6750325Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6750435Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6750455Z 2022-11-23T02:12:26.6750559Z OK (skipped=1) 2022-11-23T02:12:26.6750578Z 2022-11-23T02:12:26.6750699Z Generating XML reports... 2022-11-23T02:12:26.6751146Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015547.xml 2022-11-23T02:12:26.6751507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6751687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6752070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6752262Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6752281Z 2022-11-23T02:12:26.6752385Z Running tests... 2022-11-23T02:12:26.6752653Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6752969Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6753244Z test_barrier_timeout_global (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T02:12:26.6753264Z 2022-11-23T02:12:26.6753524Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6753617Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6753636Z 2022-11-23T02:12:26.6753744Z OK (skipped=1) 2022-11-23T02:12:26.6753763Z 2022-11-23T02:12:26.6753884Z Generating XML reports... 
2022-11-23T02:12:26.6754333Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015549.xml 2022-11-23T02:12:26.6754707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6754885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6755270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6755465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6755484Z 2022-11-23T02:12:26.6755573Z Running tests... 2022-11-23T02:12:26.6755834Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6756146Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6756421Z test_barrier_timeout_group (__main__.TestDistBackendWithSpawn) ... skip: Only gloo backend supports timeouts (0.002s) 2022-11-23T02:12:26.6756440Z 2022-11-23T02:12:26.6756697Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6756895Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6756917Z 2022-11-23T02:12:26.6757031Z OK (skipped=1) 2022-11-23T02:12:26.6757050Z 2022-11-23T02:12:26.6757171Z Generating XML reports... 
2022-11-23T02:12:26.6757622Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015551.xml 2022-11-23T02:12:26.6757979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6758154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6758538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6758797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6758816Z 2022-11-23T02:12:26.6758926Z Running tests... 2022-11-23T02:12:26.6759189Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6759508Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6759771Z test_batch_isend_irecv_gloo (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T02:12:26.6759790Z 2022-11-23T02:12:26.6760048Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6760140Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6760160Z 2022-11-23T02:12:26.6760264Z OK (skipped=1) 2022-11-23T02:12:26.6760283Z 2022-11-23T02:12:26.6760403Z Generating XML reports... 
2022-11-23T02:12:26.6760851Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015554.xml 2022-11-23T02:12:26.6761228Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6761402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6761785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6761977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6761996Z 2022-11-23T02:12:26.6762086Z Running tests... 2022-11-23T02:12:26.6762349Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6762660Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6762922Z test_batch_isend_irecv_gloo_tags (__main__.TestDistBackendWithSpawn) ... skip: GLOO Batch Send Recv CPU (0.002s) 2022-11-23T02:12:26.6762941Z 2022-11-23T02:12:26.6763203Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6763311Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6763330Z 2022-11-23T02:12:26.6763435Z OK (skipped=1) 2022-11-23T02:12:26.6763454Z 2022-11-23T02:12:26.6763574Z Generating XML reports... 
2022-11-23T02:12:26.6764021Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015556.xml 2022-11-23T02:12:26.6764380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6764554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6764936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6765126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6765145Z 2022-11-23T02:12:26.6765252Z Running tests... 2022-11-23T02:12:26.6765519Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6765831Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6766155Z test_batch_isend_irecv_mixed_backend_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:12:26.6766176Z 2022-11-23T02:12:26.6766443Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6766534Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6766553Z 2022-11-23T02:12:26.6766659Z OK (skipped=1) 2022-11-23T02:12:26.6766678Z 2022-11-23T02:12:26.6766802Z Generating XML reports... 
2022-11-23T02:12:26.6767249Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015559.xml 2022-11-23T02:12:26.6767623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6767852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6768239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6768431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6768453Z 2022-11-23T02:12:26.6768558Z Running tests... 2022-11-23T02:12:26.6768804Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6769110Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6769369Z test_batch_isend_irecv_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T02:12:26.6769388Z 2022-11-23T02:12:26.6769672Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6769781Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6769800Z 2022-11-23T02:12:26.6769904Z OK (skipped=1) 2022-11-23T02:12:26.6769927Z 2022-11-23T02:12:26.6770046Z Generating XML reports... 
2022-11-23T02:12:26.6770491Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015601.xml 2022-11-23T02:12:26.6770851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6771026Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6771408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6771598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6771617Z 2022-11-23T02:12:26.6771721Z Running tests... 2022-11-23T02:12:26.6771982Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6772293Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6772567Z test_batch_isend_irecv_no_rank_zero_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.003s) 2022-11-23T02:12:26.6772586Z 2022-11-23T02:12:26.6772841Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6772934Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6772956Z 2022-11-23T02:12:26.6773058Z OK (skipped=1) 2022-11-23T02:12:26.6773077Z 2022-11-23T02:12:26.6773197Z Generating XML reports... 
2022-11-23T02:12:26.6773643Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015603.xml 2022-11-23T02:12:26.6774015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6774190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6774570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6774762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6774781Z 2022-11-23T02:12:26.6774886Z Running tests... 2022-11-23T02:12:26.6775132Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6775551Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6775827Z test_batch_isend_irecv_op_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:12:26.6775846Z 2022-11-23T02:12:26.6776111Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6776222Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6776241Z 2022-11-23T02:12:26.6776350Z OK (skipped=1) 2022-11-23T02:12:26.6776369Z 2022-11-23T02:12:26.6776493Z Generating XML reports... 
2022-11-23T02:12:26.6776940Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015606.xml 2022-11-23T02:12:26.6777349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6777527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6777916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6778108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6778127Z 2022-11-23T02:12:26.6778235Z Running tests... 2022-11-23T02:12:26.6778496Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6778811Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6779079Z test_batch_isend_irecv_op_list_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:12:26.6779099Z 2022-11-23T02:12:26.6779358Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6779454Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6779473Z 2022-11-23T02:12:26.6779579Z OK (skipped=1) 2022-11-23T02:12:26.6779598Z 2022-11-23T02:12:26.6779718Z Generating XML reports... 
2022-11-23T02:12:26.6780165Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015608.xml 2022-11-23T02:12:26.6780538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6780713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6781093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6781284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6781303Z 2022-11-23T02:12:26.6781409Z Running tests... 2022-11-23T02:12:26.6781654Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6781967Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6782247Z test_batch_isend_irecv_ring_exchange_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:12:26.6782269Z 2022-11-23T02:12:26.6782527Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6782635Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6782655Z 2022-11-23T02:12:26.6782759Z OK (skipped=1) 2022-11-23T02:12:26.6782778Z 2022-11-23T02:12:26.6782899Z Generating XML reports... 
2022-11-23T02:12:26.6783349Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015611.xml 2022-11-23T02:12:26.6783724Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6783883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6784268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6784461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6784480Z 2022-11-23T02:12:26.6784634Z Running tests... 2022-11-23T02:12:26.6784905Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6785217Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6785484Z test_batch_isend_irecv_self_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:12:26.6785504Z 2022-11-23T02:12:26.6785763Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6785855Z Ran 1 test in 0.003s 2022-11-23T02:12:26.6785890Z 2022-11-23T02:12:26.6785979Z OK (skipped=1) 2022-11-23T02:12:26.6785998Z 2022-11-23T02:12:26.6786177Z Generating XML reports... 
2022-11-23T02:12:26.6786628Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015613.xml 2022-11-23T02:12:26.6787005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6787182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6787563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6787754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6787773Z 2022-11-23T02:12:26.6787879Z Running tests... 2022-11-23T02:12:26.6788121Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6788431Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6788695Z test_batch_isend_irecv_tensor_err (__main__.TestDistBackendWithSpawn) ... skip: NCCL Batch Send Recv Only (0.002s) 2022-11-23T02:12:26.6788718Z 2022-11-23T02:12:26.6789201Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6789322Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6789341Z 2022-11-23T02:12:26.6789454Z OK (skipped=1) 2022-11-23T02:12:26.6789474Z 2022-11-23T02:12:26.6789627Z Generating XML reports... 
2022-11-23T02:12:26.6790078Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015615.xml 2022-11-23T02:12:26.6790452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6790610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6790991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6791181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6791204Z 2022-11-23T02:12:26.6791309Z Running tests... 2022-11-23T02:12:26.6791571Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6791887Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6792137Z test_broadcast (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6792359Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27316 2022-11-23T02:12:26.6792561Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27317 2022-11-23T02:12:26.6792932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6793105Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6793481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6793675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6794040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6794292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6794687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6794878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6795110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6795360Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6795767Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6796246Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6796479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6796826Z STAGE:2022-11-23 01:56:22 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6797056Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6797386Z STAGE:2022-11-23 01:56:22 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6797665Z [1669168582.247686] [08317a7e7676:27317:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6797879Z [1669168583.913511] [08317a7e7676:27317:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6798119Z [1669168583.913511] [08317a7e7676:27317:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6798395Z [1669168582.226945] [08317a7e7676:27316:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6798623Z [1669168583.889127] [08317a7e7676:27316:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6798858Z [1669168583.889127] [08317a7e7676:27316:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6799419Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6799439Z 2022-11-23T02:12:26.6799793Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6800150Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6800484Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6800828Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6801158Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6801489Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6801811Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6802144Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6802492Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6802826Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6803262Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: 
Collection 2022-11-23T02:12:26.6803834Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6803853Z 2022-11-23T02:12:26.6804198Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6804526Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6804830Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6805220Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6805549Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6805902Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6806248Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6806571Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6806894Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6807225Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6807557Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6807889Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6808233Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 
2022-11-23T02:12:26.6808562Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6808884Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6809218Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6809546Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6809892Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6810237Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6810568Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6810879Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6811214Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6811543Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6811888Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6812230Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6812555Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6812875Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6813210Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6813535Z 
STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6813915Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6814270Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6814598Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6814923Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6815261Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6815587Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6815983Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6816333Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6816663Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6816966Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6817299Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6817629Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6817971Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6818319Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6818643Z STAGE:2022-11-23 
01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6818970Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6819306Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6819632Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6819958Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6820300Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6820624Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6820953Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6821286Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6821616Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6821961Z STAGE:2022-11-23 01:56:24 27316:27316 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6822304Z STAGE:2022-11-23 01:56:24 27317:27317 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6822405Z ok (6.707s) 2022-11-23T02:12:26.6822425Z 2022-11-23T02:12:26.6822675Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6822785Z Ran 1 test in 6.707s 2022-11-23T02:12:26.6822805Z 2022-11-23T02:12:26.6822894Z OK 2022-11-23T02:12:26.6822912Z 2022-11-23T02:12:26.6823038Z Generating XML reports... 
2022-11-23T02:12:26.6823493Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015618.xml 2022-11-23T02:12:26.6823869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6824093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6824490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6824668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6824707Z 2022-11-23T02:12:26.6824797Z Running tests... 2022-11-23T02:12:26.6825058Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6825370Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6825657Z test_broadcast_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo and Nccl backend supports CUDA allReduce (0.002s) 2022-11-23T02:12:26.6825723Z 2022-11-23T02:12:26.6825990Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6826105Z Ran 1 test in 0.002s 2022-11-23T02:12:26.6826124Z 2022-11-23T02:12:26.6826233Z OK (skipped=1) 2022-11-23T02:12:26.6826252Z 2022-11-23T02:12:26.6826376Z Generating XML reports... 
2022-11-23T02:12:26.6826811Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015627.xml 2022-11-23T02:12:26.6827187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6827361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6827746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6827938Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6827961Z 2022-11-23T02:12:26.6828067Z Running tests... 2022-11-23T02:12:26.6828332Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6828649Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6828915Z test_broadcast_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6829414Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27463 2022-11-23T02:12:26.6829640Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27464 2022-11-23T02:12:26.6830024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6830202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6830585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6830784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6831152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6831326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6831687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6831877Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6832125Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6832372Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6832777Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6833181Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6833413Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6833735Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.6833974Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6834197Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.6834607Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6834946Z STAGE:2022-11-23 01:56:33 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6835346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.6835749Z STAGE:2022-11-23 01:56:33 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6836029Z [1669168593.853155] [08317a7e7676:27464:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6836265Z [1669168595.526522] [08317a7e7676:27464:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6836510Z [1669168595.526522] [08317a7e7676:27464:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6836786Z [1669168593.830253] [08317a7e7676:27463:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6837019Z [1669168595.482672] [08317a7e7676:27463:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6837242Z [1669168595.482672] [08317a7e7676:27463:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6837805Z STAGE:2022-11-23 01:56:35 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 01:56:35 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6837826Z 2022-11-23T02:12:26.6838181Z STAGE:2022-11-23 01:56:35 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6838530Z STAGE:2022-11-23 01:56:35 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6838864Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6839184Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6839520Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6839848Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6840197Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6840542Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6840854Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6841175Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6841507Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: 
Collection 2022-11-23T02:12:26.6841834Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6842184Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6842525Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6842899Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6843234Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6843566Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6843883Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6844227Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6844573Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6844956Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6845283Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6845621Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6845946Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6846289Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6846634Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 
2022-11-23T02:12:26.6846942Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6847267Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6847606Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6847936Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6848279Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6848624Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6848949Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6849269Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6849600Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6849914Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6850259Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6850603Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6850929Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6851251Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6851585Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6851912Z 
STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6852255Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6852601Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6852913Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6853282Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6853625Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6853962Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6854310Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6854658Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6854988Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6855368Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6855699Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6856016Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6856362Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6856707Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6857032Z STAGE:2022-11-23 
01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6857359Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6857689Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6858024Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6858371Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6858714Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6859023Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6859343Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.6859674Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6860001Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.6860343Z STAGE:2022-11-23 01:56:36 27463:27463 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6860692Z STAGE:2022-11-23 01:56:36 27464:27464 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.6860792Z ok (6.667s) 2022-11-23T02:12:26.6860812Z 2022-11-23T02:12:26.6861081Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6861176Z Ran 1 test in 6.667s 2022-11-23T02:12:26.6861212Z 2022-11-23T02:12:26.6861285Z OK 2022-11-23T02:12:26.6861303Z 2022-11-23T02:12:26.6861424Z Generating XML reports... 
2022-11-23T02:12:26.6861883Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015629.xml 2022-11-23T02:12:26.6862257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6862434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6862825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6863016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6863036Z 2022-11-23T02:12:26.6863143Z Running tests... 2022-11-23T02:12:26.6863441Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6863767Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6864027Z test_broadcast_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6864251Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27577 2022-11-23T02:12:26.6864475Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27578 2022-11-23T02:12:26.6864848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6865109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6865825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6866107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6866496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6866674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6867054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6867244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6867494Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6867745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6868152Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6868553Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6868770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6869226Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6869398Z skip: Skipped due to small world size. (4.235s) 2022-11-23T02:12:26.6869420Z 2022-11-23T02:12:26.6869695Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6869808Z Ran 1 test in 4.235s 2022-11-23T02:12:26.6869828Z 2022-11-23T02:12:26.6869939Z OK (skipped=1) 2022-11-23T02:12:26.6869958Z 2022-11-23T02:12:26.6870085Z Generating XML reports... 2022-11-23T02:12:26.6870537Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015639.xml 2022-11-23T02:12:26.6870914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6871077Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6871464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6871655Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6871675Z 2022-11-23T02:12:26.6871781Z Running tests... 2022-11-23T02:12:26.6872044Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6872358Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6872621Z test_broadcast_multigpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6872845Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27680 2022-11-23T02:12:26.6873048Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27681 2022-11-23T02:12:26.6873514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6873703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6874092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6874284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6874656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6874833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6875214Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6875468Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6875702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6875952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6876357Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6876757Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6876989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6877220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6878017Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:12:26.6878132Z warnings.warn( 2022-11-23T02:12:26.6878908Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1402: UserWarning: torch.distributed.broadcast_multigpu will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#multi-gpu-collective-functions 2022-11-23T02:12:26.6879021Z warnings.warn( 2022-11-23T02:12:26.6879279Z [1669168611.191064] [08317a7e7676:27680:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6879511Z [1669168611.204276] [08317a7e7676:27680:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6879755Z [1669168611.204276] [08317a7e7676:27680:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6880031Z [1669168611.200991] [08317a7e7676:27681:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6880259Z [1669168611.214310] [08317a7e7676:27681:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6880492Z [1669168611.214310] [08317a7e7676:27681:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6880595Z ok (6.190s) 2022-11-23T02:12:26.6880615Z 2022-11-23T02:12:26.6880889Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6880998Z Ran 1 test in 6.191s 2022-11-23T02:12:26.6881017Z 2022-11-23T02:12:26.6881091Z OK 2022-11-23T02:12:26.6881113Z 2022-11-23T02:12:26.6881236Z Generating XML reports... 2022-11-23T02:12:26.6881692Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015645.xml 2022-11-23T02:12:26.6882120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6882305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6882698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6882895Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6882914Z 2022-11-23T02:12:26.6883026Z Running tests... 2022-11-23T02:12:26.6883289Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6883587Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6883914Z test_broadcast_object_list (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6884682Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82847 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.606s) 2022-11-23T02:12:26.6884703Z 2022-11-23T02:12:26.6884962Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6885074Z Ran 1 test in 1.606s 2022-11-23T02:12:26.6885092Z 2022-11-23T02:12:26.6885200Z OK (skipped=1) 2022-11-23T02:12:26.6885219Z 2022-11-23T02:12:26.6885344Z Generating XML reports... 2022-11-23T02:12:26.6885791Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015654.xml 2022-11-23T02:12:26.6886162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6886325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6886706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6886901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6886921Z 2022-11-23T02:12:26.6887027Z Running tests... 2022-11-23T02:12:26.6887286Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6887600Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6887918Z test_compute_bucket_assignment_by_size_sparse_error_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6888667Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85012 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.617s) 2022-11-23T02:12:26.6888691Z 2022-11-23T02:12:26.6888954Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6889066Z Ran 1 test in 1.617s 2022-11-23T02:12:26.6889086Z 2022-11-23T02:12:26.6889175Z OK (skipped=1) 2022-11-23T02:12:26.6889208Z 2022-11-23T02:12:26.6889313Z Generating XML reports... 2022-11-23T02:12:26.6889808Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015658.xml 2022-11-23T02:12:26.6890188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6890366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6890750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6890946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6890966Z 2022-11-23T02:12:26.6891073Z Running tests... 2022-11-23T02:12:26.6891335Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6891685Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6892014Z test_compute_bucket_assignment_by_size_sparse_error_without_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6892766Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/85339 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.655s) 2022-11-23T02:12:26.6892787Z 2022-11-23T02:12:26.6893049Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6893218Z Ran 1 test in 1.655s 2022-11-23T02:12:26.6893237Z 2022-11-23T02:12:26.6893346Z OK (skipped=1) 2022-11-23T02:12:26.6893365Z 2022-11-23T02:12:26.6893490Z Generating XML reports... 2022-11-23T02:12:26.6893946Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015702.xml 2022-11-23T02:12:26.6894322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6894501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6894868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6895062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6895081Z 2022-11-23T02:12:26.6895187Z Running tests... 2022-11-23T02:12:26.6895453Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6895766Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6896033Z test_ddp_broadcast_buffer (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6896258Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 27896 2022-11-23T02:12:26.6896479Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 27897 2022-11-23T02:12:26.6896836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6897011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6897395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6897585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6897960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6898133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6898515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6898705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6898956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6899185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6899589Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6899989Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6900224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6900453Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6900761Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcummqk6p 2022-11-23T02:12:26.6901049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcummqk6p/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6901309Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6lm9zqce 2022-11-23T02:12:26.6901562Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6lm9zqce/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6901840Z [1669168632.332468] [08317a7e7676:27896:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6902076Z [1669168632.346141] [08317a7e7676:27896:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6902364Z [1669168632.346141] [08317a7e7676:27896:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6902609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6902883Z [1669168632.338119] [08317a7e7676:27897:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6903113Z [1669168632.351529] [08317a7e7676:27897:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6903348Z [1669168632.351529] [08317a7e7676:27897:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6903587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.6903690Z ok (6.663s) 2022-11-23T02:12:26.6903710Z 2022-11-23T02:12:26.6903967Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6904079Z Ran 1 test in 6.663s 2022-11-23T02:12:26.6904099Z 2022-11-23T02:12:26.6904191Z OK 2022-11-23T02:12:26.6904210Z 2022-11-23T02:12:26.6904339Z Generating XML reports... 2022-11-23T02:12:26.6904788Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015707.xml 2022-11-23T02:12:26.6905163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6905343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6905728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6905923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6905946Z 2022-11-23T02:12:26.6906038Z Running tests... 2022-11-23T02:12:26.6906303Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6906614Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6906896Z test_ddp_broadcast_buffer_via_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6907116Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28014 2022-11-23T02:12:26.6907332Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28015 2022-11-23T02:12:26.6907707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6907883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6908250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6908448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6908816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6909280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6909690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6909883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6910136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6910384Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6910791Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6911240Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6911475Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6911713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6911978Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph4kbs99p 2022-11-23T02:12:26.6912251Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph4kbs99p/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6912507Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1isqnuhx 2022-11-23T02:12:26.6912776Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1isqnuhx/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6913051Z [1669168641.586333] [08317a7e7676:28014:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6913288Z [1669168641.599982] [08317a7e7676:28014:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6913511Z [1669168641.599982] [08317a7e7676:28014:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6913785Z [1669168641.593542] [08317a7e7676:28015:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6914012Z [1669168641.607063] [08317a7e7676:28015:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6914245Z [1669168641.607063] [08317a7e7676:28015:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6914483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6914719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.6914960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6915195Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6915296Z ok (6.752s) 2022-11-23T02:12:26.6915317Z 2022-11-23T02:12:26.6915570Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6915682Z Ran 1 test in 6.752s 2022-11-23T02:12:26.6915701Z 2022-11-23T02:12:26.6915794Z OK 2022-11-23T02:12:26.6915814Z 2022-11-23T02:12:26.6915939Z Generating XML reports... 2022-11-23T02:12:26.6916392Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015716.xml 2022-11-23T02:12:26.6916770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6916946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6917333Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6917527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6917547Z 2022-11-23T02:12:26.6917688Z Running tests... 2022-11-23T02:12:26.6917959Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6918277Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6918556Z test_ddp_buffer_hook_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6919308Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78641 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.638s) 2022-11-23T02:12:26.6919374Z 2022-11-23T02:12:26.6919646Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6919759Z Ran 1 test in 1.638s 2022-11-23T02:12:26.6919779Z 2022-11-23T02:12:26.6919886Z OK (skipped=1) 2022-11-23T02:12:26.6919905Z 2022-11-23T02:12:26.6920032Z Generating XML reports... 2022-11-23T02:12:26.6920462Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015725.xml 2022-11-23T02:12:26.6920840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6921017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6921401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6921592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6921615Z 2022-11-23T02:12:26.6921721Z Running tests... 2022-11-23T02:12:26.6921982Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6922295Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6922591Z test_ddp_buffer_hook_allreduce_return_future (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6923330Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77261 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.649s) 2022-11-23T02:12:26.6923367Z 2022-11-23T02:12:26.6923612Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6923721Z Ran 1 test in 1.649s 2022-11-23T02:12:26.6923739Z 2022-11-23T02:12:26.6923848Z OK (skipped=1) 2022-11-23T02:12:26.6923867Z 2022-11-23T02:12:26.6923991Z Generating XML reports... 2022-11-23T02:12:26.6924441Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015729.xml 2022-11-23T02:12:26.6924819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6925001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6925384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6925576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6925595Z 2022-11-23T02:12:26.6925685Z Running tests... 2022-11-23T02:12:26.6925946Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6926259Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6926553Z test_ddp_build_debug_param_to_name_mapping (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6926775Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28200 2022-11-23T02:12:26.6927042Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28201 2022-11-23T02:12:26.6927431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6927701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6928238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6928443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6928818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6929075Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6929457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6929651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6929902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6930149Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6930553Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6930936Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6931168Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6931402Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6931661Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6o3kskly 2022-11-23T02:12:26.6931936Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6o3kskly/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6932191Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbh_lze9_ 2022-11-23T02:12:26.6932460Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbh_lze9_/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6932670Z 2022-11-23T02:12:26.6932930Z [1669168659.194205] [08317a7e7676:28200:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6933162Z [1669168659.208238] [08317a7e7676:28200:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6933401Z [1669168659.208238] [08317a7e7676:28200:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6933678Z [1669168659.201149] [08317a7e7676:28201:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6933906Z [1669168659.214768] [08317a7e7676:28201:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6934143Z [1669168659.214768] [08317a7e7676:28201:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6934246Z ok (6.148s) 2022-11-23T02:12:26.6934267Z 2022-11-23T02:12:26.6934535Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6934646Z Ran 1 test in 6.148s 2022-11-23T02:12:26.6934665Z 2022-11-23T02:12:26.6934756Z OK 2022-11-23T02:12:26.6934775Z 2022-11-23T02:12:26.6934884Z Generating XML reports... 2022-11-23T02:12:26.6935336Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015733.xml 2022-11-23T02:12:26.6935710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6935934Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6936327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6936521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6936541Z 2022-11-23T02:12:26.6936648Z Running tests... 2022-11-23T02:12:26.6936910Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6937205Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6937514Z test_ddp_build_debug_param_to_name_mapping_requires_grad (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6937790Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28314 2022-11-23T02:12:26.6938010Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28315 2022-11-23T02:12:26.6938394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6938573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6938953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6939147Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6939516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6939672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6940052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6940243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6940492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6940737Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6941142Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6941542Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6941773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6942006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6942252Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp338kjhvm 2022-11-23T02:12:26.6942523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp338kjhvm/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6942785Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd2aphh7f 2022-11-23T02:12:26.6943059Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd2aphh7f/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6943333Z [1669168668.032857] [08317a7e7676:28314:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6943562Z [1669168668.046483] [08317a7e7676:28314:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6943800Z [1669168668.046483] [08317a7e7676:28314:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6944074Z [1669168668.038012] [08317a7e7676:28315:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6944348Z [1669168668.051607] [08317a7e7676:28315:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6944595Z [1669168668.051607] [08317a7e7676:28315:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6944680Z ok (6.284s) 2022-11-23T02:12:26.6944699Z 2022-11-23T02:12:26.6944968Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6945080Z Ran 1 test in 6.284s 2022-11-23T02:12:26.6945100Z 2022-11-23T02:12:26.6945193Z OK 2022-11-23T02:12:26.6945212Z 2022-11-23T02:12:26.6945335Z Generating XML reports... 2022-11-23T02:12:26.6945786Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015742.xml 2022-11-23T02:12:26.6946265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6946442Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6946815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6947008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6947028Z 2022-11-23T02:12:26.6947133Z Running tests... 2022-11-23T02:12:26.6947395Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6947706Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6947973Z test_ddp_comm_hook_logging (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6948194Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28428 2022-11-23T02:12:26.6948415Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28429 2022-11-23T02:12:26.6948775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6949207Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6949614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6949807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6950174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6950345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6950719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6950909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6951158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6951391Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6951798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6952199Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6952429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6952656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6952912Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp51v044wq 2022-11-23T02:12:26.6953184Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp51v044wq/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6953438Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphdmu5t2l 2022-11-23T02:12:26.6953781Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphdmu5t2l/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6954011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6954243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6954522Z [1669168676.769604] [08317a7e7676:28428:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6954753Z [1669168676.783837] [08317a7e7676:28428:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6954988Z [1669168676.783837] [08317a7e7676:28428:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6955323Z [1669168676.771926] [08317a7e7676:28429:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.6955556Z [1669168676.785482] [08317a7e7676:28429:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6955795Z [1669168676.785482] [08317a7e7676:28429:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6955899Z ok (6.637s) 2022-11-23T02:12:26.6955918Z 2022-11-23T02:12:26.6956191Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6956285Z Ran 1 test in 6.637s 2022-11-23T02:12:26.6956304Z 2022-11-23T02:12:26.6956398Z OK 2022-11-23T02:12:26.6956417Z 2022-11-23T02:12:26.6956544Z Generating XML reports... 2022-11-23T02:12:26.6956996Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015751.xml 2022-11-23T02:12:26.6957365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6957546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6957932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6958123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6958142Z 2022-11-23T02:12:26.6958232Z Running tests... 2022-11-23T02:12:26.6958494Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6958807Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6959102Z test_ddp_control_flow_different_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6959326Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28546 2022-11-23T02:12:26.6959544Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28547 2022-11-23T02:12:26.6959921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6960096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6960477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6960651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6961018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6961190Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6961568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6961753Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6962051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6962308Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6962711Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6963096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6963331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6963566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6963876Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9z_4brr1 2022-11-23T02:12:26.6964149Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9z_4brr1/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6964414Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwl12i724 2022-11-23T02:12:26.6964683Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwl12i724/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6964958Z [1669168686.040271] [08317a7e7676:28547:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6965191Z [1669168686.053605] [08317a7e7676:28547:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6965428Z [1669168686.053605] [08317a7e7676:28547:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6965686Z [1669168686.038266] [08317a7e7676:28546:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6965918Z [1669168686.052294] [08317a7e7676:28546:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6966148Z [1669168686.052294] [08317a7e7676:28546:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6966944Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. 
If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:12:26.6967050Z ok (6.703s) 2022-11-23T02:12:26.6967069Z 2022-11-23T02:12:26.6967342Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6967454Z Ran 1 test in 6.703s 2022-11-23T02:12:26.6967473Z 2022-11-23T02:12:26.6967566Z OK 2022-11-23T02:12:26.6967588Z 2022-11-23T02:12:26.6967716Z Generating XML reports... 2022-11-23T02:12:26.6968169Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015800.xml 2022-11-23T02:12:26.6968544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6968704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6969083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6969272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6969294Z 2022-11-23T02:12:26.6969401Z Running tests... 2022-11-23T02:12:26.6969664Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6969979Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6970307Z test_ddp_control_flow_same_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6971078Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78235 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.649s) 2022-11-23T02:12:26.6971098Z 2022-11-23T02:12:26.6971361Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6971453Z Ran 1 test in 1.649s 2022-11-23T02:12:26.6971472Z 2022-11-23T02:12:26.6971633Z OK (skipped=1) 2022-11-23T02:12:26.6971653Z 2022-11-23T02:12:26.6971776Z Generating XML reports... 2022-11-23T02:12:26.6972223Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015809.xml 2022-11-23T02:12:26.6972605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6972783Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6973163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6973356Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6973376Z 2022-11-23T02:12:26.6973481Z Running tests... 2022-11-23T02:12:26.6973728Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6974042Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6974307Z test_ddp_create_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6974526Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28698 2022-11-23T02:12:26.6974748Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28699 2022-11-23T02:12:26.6975124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6975298Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6975679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6975852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6976223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6976399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6976778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6976968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6977223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.6977473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.6977876Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.6978277Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.6978493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.6978759Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph8qvw2ct 2022-11-23T02:12:26.6979034Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph8qvw2ct/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6979311Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.6979577Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeh8m48ie 2022-11-23T02:12:26.6979846Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeh8m48ie/_remote_module_non_scriptable.py 2022-11-23T02:12:26.6980125Z [1669168698.036151] [08317a7e7676:28698:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6980355Z [1669168699.451725] [08317a7e7676:28698:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6980591Z [1669168699.451725] [08317a7e7676:28698:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6980895Z [1669168698.038576] [08317a7e7676:28699:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.6981130Z [1669168699.470997] [08317a7e7676:28699:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.6981366Z [1669168699.470997] [08317a7e7676:28699:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.6982286Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6983194Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6984380Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 2022-11-23T02:12:26.6984616Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:12:26.6985798Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Using backward() with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autograd.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/engine.cpp:1127.) 
2022-11-23T02:12:26.6986032Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:12:26.6986274Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6986512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.6987452Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6988358Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6989466Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6990500Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6991396Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6992273Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6993161Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6994046Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6994932Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. 
If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6995806Z [W reducer.cpp:380] Using DistributedDataParallel with create_graph=True is not well-supported. The higher-order gradient will not be synchronized across ranks, and backpropagation through all_reduce operations will not occur. If you require DDP to work with higher-order gradients for your use case, please ping https://github.com/pytorch/pytorch/issues/63929 2022-11-23T02:12:26.6995908Z ok (6.154s) 2022-11-23T02:12:26.6995928Z 2022-11-23T02:12:26.6996193Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6996304Z Ran 1 test in 6.154s 2022-11-23T02:12:26.6996326Z 2022-11-23T02:12:26.6996414Z OK 2022-11-23T02:12:26.6996434Z 2022-11-23T02:12:26.6996556Z Generating XML reports... 2022-11-23T02:12:26.6997007Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015814.xml 2022-11-23T02:12:26.6997428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.6997617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.6998006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.6998202Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.6998221Z 2022-11-23T02:12:26.6998330Z Running tests... 2022-11-23T02:12:26.6998596Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.6998910Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.6999212Z test_ddp_device (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.6999976Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77324 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.616s) 2022-11-23T02:12:26.6999998Z 2022-11-23T02:12:26.7000244Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7000355Z Ran 1 test in 1.617s 2022-11-23T02:12:26.7000375Z 2022-11-23T02:12:26.7000479Z OK (skipped=1) 2022-11-23T02:12:26.7000498Z 2022-11-23T02:12:26.7000621Z Generating XML reports... 2022-11-23T02:12:26.7001069Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015822.xml 2022-11-23T02:12:26.7001448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7001628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7002016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7002207Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7002226Z 2022-11-23T02:12:26.7002316Z Running tests... 2022-11-23T02:12:26.7002578Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7002887Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7003159Z test_ddp_forward_backward_hook (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7003378Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 28846 2022-11-23T02:12:26.7003602Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 28847 2022-11-23T02:12:26.7003978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7004155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7004524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7006891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7007275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7007451Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7007834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7008027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7008279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7008580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7008997Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7009401Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7009616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7009848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7010109Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxffk0roj 2022-11-23T02:12:26.7010437Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxffk0roj/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7010693Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvrulew17 2022-11-23T02:12:26.7010967Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvrulew17/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7011777Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T02:12:26.7012111Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T02:12:26.7012905Z /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1319: UserWarning: Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior. 2022-11-23T02:12:26.7013241Z warnings.warn("Using a non-full backward hook when the forward contains multiple autograd Nodes " 2022-11-23T02:12:26.7013508Z [1669168712.226804] [08317a7e7676:28847:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7013741Z [1669168712.240241] [08317a7e7676:28847:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7013979Z [1669168712.240241] [08317a7e7676:28847:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7014250Z [1669168712.224567] [08317a7e7676:28846:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7014478Z [1669168712.236970] [08317a7e7676:28846:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7014711Z [1669168712.236970] [08317a7e7676:28846:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7014814Z ok (6.649s) 2022-11-23T02:12:26.7014834Z 2022-11-23T02:12:26.7015104Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7015214Z Ran 1 test in 6.649s 2022-11-23T02:12:26.7015233Z 2022-11-23T02:12:26.7015322Z OK 2022-11-23T02:12:26.7015341Z 2022-11-23T02:12:26.7015446Z Generating XML reports... 2022-11-23T02:12:26.7015900Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015826.xml 2022-11-23T02:12:26.7016277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7016459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7016842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7017035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7017102Z 2022-11-23T02:12:26.7017215Z Running tests... 
2022-11-23T02:12:26.7017479Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7017776Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7018053Z test_ddp_grad_div_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7018809Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78685 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.637s) 2022-11-23T02:12:26.7018873Z 2022-11-23T02:12:26.7019138Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7019250Z Ran 1 test in 1.637s 2022-11-23T02:12:26.7019269Z 2022-11-23T02:12:26.7019377Z OK (skipped=1) 2022-11-23T02:12:26.7019399Z 2022-11-23T02:12:26.7019522Z Generating XML reports... 2022-11-23T02:12:26.7019968Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015836.xml 2022-11-23T02:12:26.7020346Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7020523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7020887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7021080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7021103Z 2022-11-23T02:12:26.7021209Z Running tests... 
2022-11-23T02:12:26.7021472Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7021788Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7022064Z test_ddp_hook_parity_allreduce (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7022812Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77293 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.594s) 2022-11-23T02:12:26.7022833Z 2022-11-23T02:12:26.7023094Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7023204Z Ran 1 test in 1.594s 2022-11-23T02:12:26.7023223Z 2022-11-23T02:12:26.7023331Z OK (skipped=1) 2022-11-23T02:12:26.7023350Z 2022-11-23T02:12:26.7023455Z Generating XML reports... 2022-11-23T02:12:26.7023901Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015840.xml 2022-11-23T02:12:26.7024276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7024452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7024837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7025029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7025047Z 2022-11-23T02:12:26.7025157Z Running tests... 
2022-11-23T02:12:26.7025422Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7025716Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7026017Z test_ddp_hook_parity_allreduce_process_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7026238Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29062 2022-11-23T02:12:26.7026507Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29063 2022-11-23T02:12:26.7026888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7027066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7027450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7027645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7028012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7028219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7028605Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7028799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7029280Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7029539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7029955Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7030357Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7030593Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7030849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7031061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7031310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7031713Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7032112Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7032370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpju6vszh7 2022-11-23T02:12:26.7032629Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxtmb8u_0 2022-11-23T02:12:26.7032901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpju6vszh7/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7033173Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxtmb8u_0/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7033454Z [1669168729.672798] [08317a7e7676:29063:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7033672Z [1669168729.686041] [08317a7e7676:29063:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7033911Z [1669168729.686041] [08317a7e7676:29063:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7034187Z [1669168729.663126] [08317a7e7676:29062:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7034415Z [1669168729.676899] [08317a7e7676:29062:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7034648Z [1669168729.676899] [08317a7e7676:29062:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7034888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7035201Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7035451Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7035686Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7035769Z ok (6.929s) 2022-11-23T02:12:26.7035806Z 2022-11-23T02:12:26.7036062Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7036176Z Ran 1 test in 6.929s 2022-11-23T02:12:26.7036195Z 2022-11-23T02:12:26.7036288Z OK 2022-11-23T02:12:26.7036307Z 2022-11-23T02:12:26.7036433Z Generating XML reports... 
2022-11-23T02:12:26.7036951Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015844.xml 2022-11-23T02:12:26.7037330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7037512Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7037899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7038075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7038096Z 2022-11-23T02:12:26.7038206Z Running tests... 2022-11-23T02:12:26.7038470Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7038785Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7039063Z test_ddp_hook_parity_post_localSGD (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7039288Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29180 2022-11-23T02:12:26.7039506Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29181 2022-11-23T02:12:26.7039882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7040042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7040425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7040618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7040983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7041156Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7041532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7041727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7041976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7042229Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7042616Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7043015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7043248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7043528Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:12:26.7043763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7044034Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:12:26.7044343Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxvahmy8g 2022-11-23T02:12:26.7044626Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxvahmy8g/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7044883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmgi132js 2022-11-23T02:12:26.7045135Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmgi132js/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7045410Z [1669168739.053123] [08317a7e7676:29180:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7045640Z [1669168739.066799] [08317a7e7676:29180:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7045928Z [1669168739.066799] [08317a7e7676:29180:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7046204Z [1669168739.059330] [08317a7e7676:29181:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7046436Z [1669168739.072804] [08317a7e7676:29181:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7046674Z [1669168739.072804] [08317a7e7676:29181:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7046914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7047154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7047391Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7047612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7047903Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T02:12:26.7048182Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T02:12:26.7048461Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:12:26.7048732Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 10 iterations 2022-11-23T02:12:26.7048968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7049202Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7049439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7049654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7049934Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 
2022-11-23T02:12:26.7050209Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Start to apply local SGD after 10 iterations. 2022-11-23T02:12:26.7050487Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T02:12:26.7050756Z INFO:torch.distributed.algorithms.ddp_comm_hooks.post_localSGD_hook:Local SGD will be started after 1000 iterations 2022-11-23T02:12:26.7050990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7051226Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7051459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7051689Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7051772Z ok (7.331s) 2022-11-23T02:12:26.7051791Z 2022-11-23T02:12:26.7052112Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7052232Z Ran 1 test in 7.331s 2022-11-23T02:12:26.7052252Z 2022-11-23T02:12:26.7052348Z OK 2022-11-23T02:12:26.7052366Z 2022-11-23T02:12:26.7052490Z Generating XML reports... 2022-11-23T02:12:26.7052944Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015853.xml 2022-11-23T02:12:26.7053321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7053501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7053955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7054129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7054148Z 2022-11-23T02:12:26.7054255Z Running tests... 
2022-11-23T02:12:26.7054523Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7054834Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7055105Z test_ddp_hook_parity_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7055860Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77378 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.615s) 2022-11-23T02:12:26.7055884Z 2022-11-23T02:12:26.7056149Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7056261Z Ran 1 test in 1.615s 2022-11-23T02:12:26.7056281Z 2022-11-23T02:12:26.7056388Z OK (skipped=1) 2022-11-23T02:12:26.7056407Z 2022-11-23T02:12:26.7056512Z Generating XML reports... 2022-11-23T02:12:26.7056964Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015903.xml 2022-11-23T02:12:26.7057340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7057516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7057901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7058094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7058113Z 2022-11-23T02:12:26.7058226Z Running tests... 
2022-11-23T02:12:26.7058490Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7058801Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7059062Z test_ddp_hook_pickling_powerSGD (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7059285Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29332 2022-11-23T02:12:26.7059503Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29333 2022-11-23T02:12:26.7059874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7060048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7060428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7060617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7060988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7061144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7061576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7061777Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7062028Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7062278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7062686Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7063085Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7063364Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7063929Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:12:26.7064162Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7064707Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 4; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:12:26.7064957Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppxy2ogdc 2022-11-23T02:12:26.7065231Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppxy2ogdc/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7065491Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphlu3b_zc 2022-11-23T02:12:26.7065758Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphlu3b_zc/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7065995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7066233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.7066510Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T02:12:26.7066787Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Start to apply PowerSGD after 4 iterations. 2022-11-23T02:12:26.7067093Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T02:12:26.7067379Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:A zero tensor of length 10 that represents local error is created. 2022-11-23T02:12:26.7067720Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T02:12:26.7068052Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Compression stats: iter 4, total before compression 10, total after compression 10, rate 1.0 2022-11-23T02:12:26.7068380Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T02:12:26.7068697Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:Allocating contiguous memory of length 0 for Ps, and of length 0 for Qs, respectively. 2022-11-23T02:12:26.7069191Z [1669168753.060613] [08317a7e7676:29332:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7069511Z [1669168753.074447] [08317a7e7676:29332:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7069760Z [1669168753.074447] [08317a7e7676:29332:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7070035Z [1669168753.068002] [08317a7e7676:29333:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7070265Z [1669168753.081376] [08317a7e7676:29333:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7070499Z [1669168753.081376] [08317a7e7676:29333:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7070788Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7071022Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7071123Z ok (6.728s) 2022-11-23T02:12:26.7071146Z 2022-11-23T02:12:26.7071425Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7071540Z Ran 1 test in 6.728s 2022-11-23T02:12:26.7071559Z 2022-11-23T02:12:26.7071652Z OK 2022-11-23T02:12:26.7071671Z 2022-11-23T02:12:26.7071795Z Generating XML reports... 2022-11-23T02:12:26.7072250Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015907.xml 2022-11-23T02:12:26.7072612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7072791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7073180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7073372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7073392Z 2022-11-23T02:12:26.7073500Z Running tests... 2022-11-23T02:12:26.7073759Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7074072Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7074473Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... 
skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7074493Z 2022-11-23T02:12:26.7074753Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7074845Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7074879Z 2022-11-23T02:12:26.7074971Z OK (skipped=1) 2022-11-23T02:12:26.7074990Z 2022-11-23T02:12:26.7075111Z Generating XML reports... 2022-11-23T02:12:26.7075613Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015917.xml 2022-11-23T02:12:26.7075995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7076173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7076556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7076748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7076767Z 2022-11-23T02:12:26.7076874Z Running tests... 2022-11-23T02:12:26.7077118Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7077427Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7077827Z test_ddp_hook_with_optimizer_parity_adam_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7077848Z 2022-11-23T02:12:26.7078157Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7078275Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7078295Z 2022-11-23T02:12:26.7078402Z OK (skipped=1) 2022-11-23T02:12:26.7078422Z 2022-11-23T02:12:26.7078546Z Generating XML reports... 
2022-11-23T02:12:26.7079000Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015919.xml 2022-11-23T02:12:26.7079379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7079540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7079923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7080164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7080184Z 2022-11-23T02:12:26.7080291Z Running tests... 2022-11-23T02:12:26.7080559Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7080873Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7081326Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7081346Z 2022-11-23T02:12:26.7081605Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7081714Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7081733Z 2022-11-23T02:12:26.7081842Z OK (skipped=1) 2022-11-23T02:12:26.7081861Z 2022-11-23T02:12:26.7081967Z Generating XML reports... 
2022-11-23T02:12:26.7082413Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015921.xml 2022-11-23T02:12:26.7082789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7082965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7083348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7083540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7083559Z 2022-11-23T02:12:26.7083665Z Running tests... 2022-11-23T02:12:26.7083927Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7084222Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7084679Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7084699Z 2022-11-23T02:12:26.7084963Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7085074Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7085093Z 2022-11-23T02:12:26.7085201Z OK (skipped=1) 2022-11-23T02:12:26.7085220Z 2022-11-23T02:12:26.7085341Z Generating XML reports... 
2022-11-23T02:12:26.7085786Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015924.xml 2022-11-23T02:12:26.7086162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7086339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7086729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7086904Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7086923Z 2022-11-23T02:12:26.7087125Z Running tests... 2022-11-23T02:12:26.7087402Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7087717Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7088171Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7088191Z 2022-11-23T02:12:26.7088454Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7088566Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7088629Z 2022-11-23T02:12:26.7088744Z OK (skipped=1) 2022-11-23T02:12:26.7088762Z 2022-11-23T02:12:26.7088886Z Generating XML reports... 
2022-11-23T02:12:26.7089320Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015926.xml 2022-11-23T02:12:26.7089745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7089925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7090312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7090507Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7090526Z 2022-11-23T02:12:26.7090633Z Running tests... 2022-11-23T02:12:26.7090895Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7091211Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7091670Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_False_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7091690Z 2022-11-23T02:12:26.7091934Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7092044Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7092063Z 2022-11-23T02:12:26.7092170Z OK (skipped=1) 2022-11-23T02:12:26.7092189Z 2022-11-23T02:12:26.7092310Z Generating XML reports... 
2022-11-23T02:12:26.7092753Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015929.xml 2022-11-23T02:12:26.7093125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7093300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7093689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7093882Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7093904Z 2022-11-23T02:12:26.7093995Z Running tests... 2022-11-23T02:12:26.7094257Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7094570Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7095018Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7095037Z 2022-11-23T02:12:26.7095295Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7095410Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7095429Z 2022-11-23T02:12:26.7095536Z OK (skipped=1) 2022-11-23T02:12:26.7095554Z 2022-11-23T02:12:26.7095676Z Generating XML reports... 
2022-11-23T02:12:26.7096171Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015931.xml 2022-11-23T02:12:26.7096535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7096715Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7097098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7097291Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7097310Z 2022-11-23T02:12:26.7097417Z Running tests... 2022-11-23T02:12:26.7097677Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7098042Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7098497Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_False_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7098517Z 2022-11-23T02:12:26.7098778Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7098871Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7098910Z 2022-11-23T02:12:26.7098998Z OK (skipped=1) 2022-11-23T02:12:26.7099017Z 2022-11-23T02:12:26.7099138Z Generating XML reports... 
2022-11-23T02:12:26.7099581Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015933.xml 2022-11-23T02:12:26.7099954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7100132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7100514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7100705Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7100725Z 2022-11-23T02:12:26.7100831Z Running tests... 2022-11-23T02:12:26.7101076Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7101389Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7101837Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7101857Z 2022-11-23T02:12:26.7102118Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7102232Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7102250Z 2022-11-23T02:12:26.7102355Z OK (skipped=1) 2022-11-23T02:12:26.7102374Z 2022-11-23T02:12:26.7102495Z Generating XML reports... 
2022-11-23T02:12:26.7102947Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015936.xml 2022-11-23T02:12:26.7103326Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7103502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7103869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7104061Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7104080Z 2022-11-23T02:12:26.7104190Z Running tests... 2022-11-23T02:12:26.7104449Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7104767Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7105263Z test_ddp_hook_with_optimizer_parity_adamw_grad_as_bucket_view_True_static_graph_True_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7105285Z 2022-11-23T02:12:26.7105554Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7105665Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7105684Z 2022-11-23T02:12:26.7105791Z OK (skipped=1) 2022-11-23T02:12:26.7105809Z 2022-11-23T02:12:26.7105914Z Generating XML reports... 
2022-11-23T02:12:26.7106363Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015938.xml 2022-11-23T02:12:26.7106739Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7106968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7107355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7107550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7107570Z 2022-11-23T02:12:26.7107679Z Running tests... 2022-11-23T02:12:26.7107942Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7108239Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7108636Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_False (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7108655Z 2022-11-23T02:12:26.7108919Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7109252Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7109272Z 2022-11-23T02:12:26.7109380Z OK (skipped=1) 2022-11-23T02:12:26.7109400Z 2022-11-23T02:12:26.7109525Z Generating XML reports... 
2022-11-23T02:12:26.7109991Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015941.xml 2022-11-23T02:12:26.7110371Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7110548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7110930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7111105Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7111124Z 2022-11-23T02:12:26.7111231Z Running tests... 2022-11-23T02:12:26.7111499Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7111821Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7112218Z test_ddp_hook_with_optimizer_parity_sgd_optimize_subset_True (__main__.TestDistBackendWithSpawn) ... skip: Issues with async error handling, see https://github.com/pytorch/pytorch/issues/73259 (0.002s) 2022-11-23T02:12:26.7112238Z 2022-11-23T02:12:26.7112500Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7112608Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7112627Z 2022-11-23T02:12:26.7112733Z OK (skipped=1) 2022-11-23T02:12:26.7112752Z 2022-11-23T02:12:26.7112875Z Generating XML reports... 
2022-11-23T02:12:26.7113303Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015943.xml 2022-11-23T02:12:26.7113677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7113855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7114238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7114502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7114524Z 2022-11-23T02:12:26.7114640Z Running tests... 2022-11-23T02:12:26.7114907Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7115222Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7115475Z test_ddp_ignore_params_arg (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7116225Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77325 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.663s) 2022-11-23T02:12:26.7116335Z 2022-11-23T02:12:26.7116584Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7116698Z Ran 1 test in 1.664s 2022-11-23T02:12:26.7116718Z 2022-11-23T02:12:26.7116831Z OK (skipped=1) 2022-11-23T02:12:26.7116850Z 2022-11-23T02:12:26.7116973Z Generating XML reports... 
2022-11-23T02:12:26.7117418Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015945.xml 2022-11-23T02:12:26.7117791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7117968Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7118352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7118527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7118567Z 2022-11-23T02:12:26.7118658Z Running tests... 2022-11-23T02:12:26.7118921Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7119237Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7119495Z test_ddp_inference (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7119716Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29880 2022-11-23T02:12:26.7119937Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29881 2022-11-23T02:12:26.7120313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7120471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7120851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7121046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7121416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7121589Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7121967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7122156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7122406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7122654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7123042Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7123450Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7123685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7123965Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7124232Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3sv_4rfs 2022-11-23T02:12:26.7124503Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3sv_4rfs/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7124761Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp97zw1y_b 2022-11-23T02:12:26.7125033Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp97zw1y_b/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7125309Z [1669168795.386422] [08317a7e7676:29880:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7125572Z [1669168795.400125] [08317a7e7676:29880:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7125814Z [1669168795.400125] [08317a7e7676:29880:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7126089Z [1669168795.390318] [08317a7e7676:29881:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7126321Z [1669168795.403815] [08317a7e7676:29881:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7126555Z [1669168795.403815] [08317a7e7676:29881:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7126657Z ok (7.053s) 2022-11-23T02:12:26.7126676Z 2022-11-23T02:12:26.7126947Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7127062Z Ran 1 test in 7.053s 2022-11-23T02:12:26.7127082Z 2022-11-23T02:12:26.7127174Z OK 2022-11-23T02:12:26.7127193Z 2022-11-23T02:12:26.7127298Z Generating XML reports... 2022-11-23T02:12:26.7127755Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015950.xml 2022-11-23T02:12:26.7128135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7128312Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7128695Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7128885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7128904Z 2022-11-23T02:12:26.7129010Z Running tests... 2022-11-23T02:12:26.7129275Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7129594Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7129855Z test_ddp_join_model_equivalence (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7130080Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 29994 2022-11-23T02:12:26.7130301Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 29995 2022-11-23T02:12:26.7130674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7130850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7131231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7131420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7131789Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7131946Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7132370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7132564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7132814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7133062Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7133471Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7133869Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7134152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7134385Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7134633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsifai262 2022-11-23T02:12:26.7134907Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsifai262/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7135163Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp14t7jtwr 2022-11-23T02:12:26.7135432Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp14t7jtwr/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7135672Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7135911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7136190Z [1669168805.375413] [08317a7e7676:29994:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7136422Z [1669168805.388614] [08317a7e7676:29994:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7136662Z [1669168805.388614] [08317a7e7676:29994:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7136933Z [1669168805.380323] [08317a7e7676:29995:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7137146Z [1669168805.393295] [08317a7e7676:29995:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7137383Z [1669168805.393295] [08317a7e7676:29995:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7137797Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:12:26.7137961Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:12:26.7138370Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:12:26.7138535Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:12:26.7138640Z ok (6.657s) 2022-11-23T02:12:26.7138659Z 2022-11-23T02:12:26.7138924Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7139018Z Ran 1 test in 6.657s 2022-11-23T02:12:26.7139053Z 2022-11-23T02:12:26.7139127Z OK 2022-11-23T02:12:26.7139146Z 2022-11-23T02:12:26.7139268Z Generating XML reports... 2022-11-23T02:12:26.7139717Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015959.xml 2022-11-23T02:12:26.7140089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7140271Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7140656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7140898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7140919Z 2022-11-23T02:12:26.7141033Z Running tests... 
2022-11-23T02:12:26.7141281Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7141596Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7141865Z test_ddp_logging_data_cpu (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7142087Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30112 2022-11-23T02:12:26.7142307Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30113 2022-11-23T02:12:26.7142732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7142908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7143301Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7143478Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7143847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7144019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7144397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7144587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7144838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7145089Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7145499Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7145901Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7146116Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7146373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp55c8iirm 2022-11-23T02:12:26.7146643Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp55c8iirm/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7146871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7147132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphzy1o7o7 2022-11-23T02:12:26.7147399Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphzy1o7o7/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7147642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7147881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7148156Z [1669168812.796945] [08317a7e7676:30113:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7148372Z [1669168814.216357] [08317a7e7676:30113:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7148610Z [1669168814.216357] [08317a7e7676:30113:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7148886Z [1669168812.775557] [08317a7e7676:30112:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7149402Z [1669168814.204581] [08317a7e7676:30112:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7149653Z [1669168814.204581] [08317a7e7676:30112:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7149757Z ok (6.274s) 2022-11-23T02:12:26.7149778Z 2022-11-23T02:12:26.7150054Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7150167Z Ran 1 test in 6.274s 2022-11-23T02:12:26.7150186Z 2022-11-23T02:12:26.7150278Z OK 2022-11-23T02:12:26.7150296Z 2022-11-23T02:12:26.7150403Z Generating XML reports... 2022-11-23T02:12:26.7150856Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020008.xml 2022-11-23T02:12:26.7151306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7151485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7151871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7152064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7152084Z 2022-11-23T02:12:26.7152193Z Running tests... 2022-11-23T02:12:26.7152457Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7152768Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7153017Z test_ddp_logging_data_gpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7153239Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30256 2022-11-23T02:12:26.7153462Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30257 2022-11-23T02:12:26.7153837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7154012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7154394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7154585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7154952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7155108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7155485Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7155679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7155929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7156174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7156580Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7156981Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7157214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7157443Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7157687Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdz01zsbk 2022-11-23T02:12:26.7157965Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdz01zsbk/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7158224Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzfn4pvgo 2022-11-23T02:12:26.7158543Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzfn4pvgo/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7158829Z [1669168822.985388] [08317a7e7676:30256:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7159062Z [1669168822.998845] [08317a7e7676:30256:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7159301Z [1669168822.998845] [08317a7e7676:30256:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7159542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7159821Z [1669168822.991868] [08317a7e7676:30257:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7160104Z [1669168823.005073] [08317a7e7676:30257:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7160327Z [1669168823.005073] [08317a7e7676:30257:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7160569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.7160670Z ok (6.653s) 2022-11-23T02:12:26.7160689Z 2022-11-23T02:12:26.7160964Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7161076Z Ran 1 test in 6.654s 2022-11-23T02:12:26.7161096Z 2022-11-23T02:12:26.7161189Z OK 2022-11-23T02:12:26.7161207Z 2022-11-23T02:12:26.7161333Z Generating XML reports... 2022-11-23T02:12:26.7161785Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020017.xml 2022-11-23T02:12:26.7162146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7162325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7162710Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7162903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7162922Z 2022-11-23T02:12:26.7163028Z Running tests... 2022-11-23T02:12:26.7163291Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7163606Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7163894Z test_ddp_model_diff_num_params_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7164119Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30374 2022-11-23T02:12:26.7164322Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30375 2022-11-23T02:12:26.7164698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7164873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7165258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7165452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7165820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7165994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7166375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7166553Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7166803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7167099Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7167514Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7167916Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7168151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7168384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7168627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7168923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7169312Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7169710Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7169956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:12:26.7170198Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:12:26.7170590Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.7170976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:12:26.7171241Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpihcmcvpr 2022-11-23T02:12:26.7171520Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpihcmcvpr/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7171777Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0wcj_nl7 2022-11-23T02:12:26.7172046Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0wcj_nl7/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7172302Z [1669168832.302151] [08317a7e7676:30375:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7172537Z [1669168832.315543] [08317a7e7676:30375:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7172774Z [1669168832.315543] [08317a7e7676:30375:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7173047Z [1669168832.297003] [08317a7e7676:30374:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7173278Z [1669168832.310955] [08317a7e7676:30374:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7173512Z [1669168832.310955] [08317a7e7676:30374:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7173613Z ok (6.254s) 2022-11-23T02:12:26.7173632Z 2022-11-23T02:12:26.7173904Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7174013Z Ran 1 test in 6.254s 2022-11-23T02:12:26.7174032Z 2022-11-23T02:12:26.7174105Z OK 2022-11-23T02:12:26.7174141Z 2022-11-23T02:12:26.7174248Z Generating XML reports... 
2022-11-23T02:12:26.7174698Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020026.xml 2022-11-23T02:12:26.7175078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7175255Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7175685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7175885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7175905Z 2022-11-23T02:12:26.7176014Z Running tests... 2022-11-23T02:12:26.7176278Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7176577Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7176863Z test_ddp_model_diff_shape_across_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7177086Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30494 2022-11-23T02:12:26.7177366Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30495 2022-11-23T02:12:26.7177742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7177921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7178305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7178496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7178846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7179021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7179397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7179589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7179837Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7180086Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7180491Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7180890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7181121Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7181334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7181574Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7181822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7182224Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7182624Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7182867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:12:26.7183104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:12:26.7183505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.7183901Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:12:26.7184149Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4799dbx8 2022-11-23T02:12:26.7184421Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4799dbx8/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7184727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr0s4ebu_ 2022-11-23T02:12:26.7185006Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr0s4ebu_/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7185282Z [1669168841.126864] [08317a7e7676:30495:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7185518Z [1669168841.140303] [08317a7e7676:30495:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7185756Z [1669168841.140303] [08317a7e7676:30495:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7186199Z [1669168851.525349] [08317a7e7676:30495:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x557d85c750c0 was not matched 2022-11-23T02:12:26.7186471Z [1669168841.122287] [08317a7e7676:30494:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7186699Z [1669168841.136355] [08317a7e7676:30494:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7186912Z [1669168841.136355] [08317a7e7676:30494:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7187231Z [1669168851.494381] [08317a7e7676:30494:1] ucc_schedule.h:189 UCC WARN timeout 10 sec. 
has expired on req 0x55d8826d6900, seq_num 3, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T02:12:26.7187515Z [1669168851.535465] [08317a7e7676:30494:0] mpool.c:55 UCX WARN object 0x55d8827e7cc0 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T02:12:26.7187621Z ok (16.268s) 2022-11-23T02:12:26.7187640Z 2022-11-23T02:12:26.7187909Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7188024Z Ran 1 test in 16.268s 2022-11-23T02:12:26.7188044Z 2022-11-23T02:12:26.7188135Z OK 2022-11-23T02:12:26.7188154Z 2022-11-23T02:12:26.7188278Z Generating XML reports... 2022-11-23T02:12:26.7188727Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020035.xml 2022-11-23T02:12:26.7189322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7189489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7189916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7190116Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7190135Z 2022-11-23T02:12:26.7190244Z Running tests... 2022-11-23T02:12:26.7190507Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7190824Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7191135Z test_ddp_multiple_nested_unused_params_err_ignore_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7191355Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30614 2022-11-23T02:12:26.7191557Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30615 2022-11-23T02:12:26.7191931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7192109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7192498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7192687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7193128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7193314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7193692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7193885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7194118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7194368Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7194772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7195241Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7195481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7195718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7195979Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn52ypwrx 2022-11-23T02:12:26.7196254Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn52ypwrx/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7196511Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_swl4pk6 2022-11-23T02:12:26.7196764Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_swl4pk6/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7197047Z [1669168859.874516] [08317a7e7676:30614:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7197284Z [1669168859.888311] [08317a7e7676:30614:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7197521Z [1669168859.888311] [08317a7e7676:30614:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7197794Z [1669168859.876864] [08317a7e7676:30615:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7198022Z [1669168859.890271] [08317a7e7676:30615:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7198257Z [1669168859.890271] [08317a7e7676:30615:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7198360Z ok (7.071s) 2022-11-23T02:12:26.7198380Z 2022-11-23T02:12:26.7198650Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7198744Z Ran 1 test in 7.071s 2022-11-23T02:12:26.7198781Z 2022-11-23T02:12:26.7198855Z OK 2022-11-23T02:12:26.7198873Z 2022-11-23T02:12:26.7198999Z Generating XML reports... 2022-11-23T02:12:26.7199452Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020054.xml 2022-11-23T02:12:26.7199824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7200001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7200386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7200579Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7200601Z 2022-11-23T02:12:26.7200709Z Running tests... 2022-11-23T02:12:26.7200957Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7201270Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7201616Z test_ddp_multiple_nested_unused_params_error (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7201846Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30732 2022-11-23T02:12:26.7202067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30733 2022-11-23T02:12:26.7202442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7202617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7202998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7203220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7203587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7203767Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7204146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7204336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7204585Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7204834Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7205237Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7205641Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7205858Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7206093Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7206354Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp_99ltmq 2022-11-23T02:12:26.7206629Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp_99ltmq/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7206883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3e_utp9m 2022-11-23T02:12:26.7207153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3e_utp9m/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7207428Z [1669168869.434287] [08317a7e7676:30732:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7207663Z [1669168869.447993] [08317a7e7676:30732:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7207903Z [1669168869.447993] [08317a7e7676:30732:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7208177Z [1669168869.441503] [08317a7e7676:30733:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7208389Z [1669168869.454986] [08317a7e7676:30733:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7208624Z [1669168869.454986] [08317a7e7676:30733:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7208725Z ok (6.927s) 2022-11-23T02:12:26.7208744Z 2022-11-23T02:12:26.7209011Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7209124Z Ran 1 test in 6.927s 2022-11-23T02:12:26.7209143Z 2022-11-23T02:12:26.7209232Z OK 2022-11-23T02:12:26.7209251Z 2022-11-23T02:12:26.7209374Z Generating XML reports... 2022-11-23T02:12:26.7209872Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020104.xml 2022-11-23T02:12:26.7210240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7210419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7210799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7210992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7211011Z 2022-11-23T02:12:26.7211120Z Running tests... 2022-11-23T02:12:26.7211382Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7211744Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7212003Z test_ddp_namedtuple (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7212228Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30850 2022-11-23T02:12:26.7212432Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30851 2022-11-23T02:12:26.7212809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7212984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7213369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7213559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7213931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7214108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7214492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7214665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7214912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7215157Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7215561Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7215962Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7216196Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7216425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7216687Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6da60si_ 2022-11-23T02:12:26.7216957Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6da60si_/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7217194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3gstbi35 2022-11-23T02:12:26.7217463Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3gstbi35/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7217738Z [1669168878.986784] [08317a7e7676:30851:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7217968Z [1669168879.000089] [08317a7e7676:30851:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7218212Z [1669168879.000089] [08317a7e7676:30851:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7218530Z [1669168878.983213] [08317a7e7676:30850:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7218766Z [1669168878.997094] [08317a7e7676:30850:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7219002Z [1669168878.997094] [08317a7e7676:30850:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7219106Z ok (6.720s) 2022-11-23T02:12:26.7219126Z 2022-11-23T02:12:26.7219393Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7219488Z Ran 1 test in 6.720s 2022-11-23T02:12:26.7219507Z 2022-11-23T02:12:26.7219598Z OK 2022-11-23T02:12:26.7219710Z 2022-11-23T02:12:26.7219840Z Generating XML reports... 2022-11-23T02:12:26.7220293Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020113.xml 2022-11-23T02:12:26.7220674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7220853Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7221236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7221431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7221451Z 2022-11-23T02:12:26.7221557Z Running tests... 2022-11-23T02:12:26.7221803Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7222116Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7222386Z test_ddp_new_tensor_in_fwd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7222609Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 30964 2022-11-23T02:12:26.7222828Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 30965 2022-11-23T02:12:26.7223205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7223381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7223764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7223939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7224304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7224478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7224860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7225045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7225297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7225543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7225947Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7226345Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7226560Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7226792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7227053Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwroq5q18 2022-11-23T02:12:26.7227324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwroq5q18/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7227624Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzo_3_na7 2022-11-23T02:12:26.7227897Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzo_3_na7/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7228172Z [1669168888.280206] [08317a7e7676:30964:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7228402Z [1669168888.293998] [08317a7e7676:30964:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7228641Z [1669168888.293998] [08317a7e7676:30964:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7229711Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:12:26.7229992Z [1669168888.284909] [08317a7e7676:30965:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7230206Z [1669168888.298460] [08317a7e7676:30965:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7230444Z [1669168888.298460] [08317a7e7676:30965:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7231244Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:12:26.7231489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7231723Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7231823Z ok (6.664s) 2022-11-23T02:12:26.7231843Z 2022-11-23T02:12:26.7232120Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7232234Z Ran 1 test in 6.664s 2022-11-23T02:12:26.7232254Z 2022-11-23T02:12:26.7232346Z OK 2022-11-23T02:12:26.7232365Z 2022-11-23T02:12:26.7232489Z Generating XML reports... 
2022-11-23T02:12:26.7232927Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020122.xml 2022-11-23T02:12:26.7233307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7233484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7233869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7234060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7234079Z 2022-11-23T02:12:26.7234186Z Running tests... 2022-11-23T02:12:26.7234448Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7234768Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7235055Z test_ddp_new_tensor_in_fwd_static_graph (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7235883Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78338 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.668s) 2022-11-23T02:12:26.7235907Z 2022-11-23T02:12:26.7236161Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7236273Z Ran 1 test in 1.668s 2022-11-23T02:12:26.7236292Z 2022-11-23T02:12:26.7236400Z OK (skipped=1) 2022-11-23T02:12:26.7236419Z 2022-11-23T02:12:26.7236544Z Generating XML reports... 
2022-11-23T02:12:26.7236993Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020132.xml 2022-11-23T02:12:26.7237446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7237626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7238009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7238201Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7238220Z 2022-11-23T02:12:26.7238309Z Running tests... 2022-11-23T02:12:26.7238572Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7238885Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7239169Z test_ddp_profiling_autograd_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7239923Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77342 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.598s) 2022-11-23T02:12:26.7239944Z 2022-11-23T02:12:26.7240201Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7240313Z Ran 1 test in 1.598s 2022-11-23T02:12:26.7240332Z 2022-11-23T02:12:26.7240442Z OK (skipped=1) 2022-11-23T02:12:26.7240460Z 2022-11-23T02:12:26.7240582Z Generating XML reports... 
2022-11-23T02:12:26.7241014Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020136.xml 2022-11-23T02:12:26.7241389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7241565Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7241952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7242144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7242167Z 2022-11-23T02:12:26.7242272Z Running tests... 2022-11-23T02:12:26.7242537Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7242851Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7243129Z test_ddp_profiling_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7243335Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31150 2022-11-23T02:12:26.7243556Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31151 2022-11-23T02:12:26.7243930Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7244106Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7244491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7244736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7245117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7245291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7245669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7245841Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7246091Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7246388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7246792Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7247195Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7247428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7247649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7247906Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmp93y76j 2022-11-23T02:12:26.7248185Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmp93y76j/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7248440Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpckuo7bmf 2022-11-23T02:12:26.7248718Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpckuo7bmf/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7248997Z [1669168905.795653] [08317a7e7676:31150:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7249232Z [1669168905.809459] [08317a7e7676:31150:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7249470Z [1669168905.809459] [08317a7e7676:31150:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7249798Z STAGE:2022-11-23 02:01:46 31150:31150 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7250073Z [1669168905.798164] [08317a7e7676:31151:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7250306Z [1669168905.811553] [08317a7e7676:31151:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7250540Z [1669168905.811553] [08317a7e7676:31151:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7250886Z STAGE:2022-11-23 02:01:46 31151:31151 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7251126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7251361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7251702Z STAGE:2022-11-23 02:01:47 31150:31150 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7252034Z STAGE:2022-11-23 02:01:47 31151:31151 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7252365Z STAGE:2022-11-23 02:01:47 31151:31151 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7252720Z STAGE:2022-11-23 02:01:47 31150:31150 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7253568Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:12:26.7254367Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. 
This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:12:26.7254751Z STAGE:2022-11-23 02:01:47 31150:31150 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7255081Z STAGE:2022-11-23 02:01:47 31151:31151 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7255418Z STAGE:2022-11-23 02:01:47 31150:31150 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7255768Z STAGE:2022-11-23 02:01:47 31150:31150 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7256101Z STAGE:2022-11-23 02:01:47 31151:31151 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7256445Z STAGE:2022-11-23 02:01:47 31151:31151 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7256550Z ok (7.652s) 2022-11-23T02:12:26.7256569Z 2022-11-23T02:12:26.7256843Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7256955Z Ran 1 test in 7.652s 2022-11-23T02:12:26.7256974Z 2022-11-23T02:12:26.7257047Z OK 2022-11-23T02:12:26.7257069Z 2022-11-23T02:12:26.7257192Z Generating XML reports... 
2022-11-23T02:12:26.7257646Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020140.xml 2022-11-23T02:12:26.7258022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7258199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7258584Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7258781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7258803Z 2022-11-23T02:12:26.7258909Z Running tests... 2022-11-23T02:12:26.7259154Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7259468Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7259742Z test_ddp_python_error_logged (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7259964Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31272 2022-11-23T02:12:26.7260181Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31273 2022-11-23T02:12:26.7260551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7260728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7261112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7261307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7261656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7261875Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7262272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7262461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7262710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7262958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7263362Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7263814Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7264048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7264266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7264527Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwwwiv8z4 2022-11-23T02:12:26.7264798Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwwwiv8z4/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7265060Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoy3lau6k 2022-11-23T02:12:26.7265331Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoy3lau6k/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7265606Z [1669168915.993368] [08317a7e7676:31273:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7265843Z [1669168916.006646] [08317a7e7676:31273:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7266081Z [1669168916.006646] [08317a7e7676:31273:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7266355Z [1669168915.989623] [08317a7e7676:31272:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7266567Z [1669168916.003391] [08317a7e7676:31272:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7266803Z [1669168916.003391] [08317a7e7676:31272:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7266903Z ok (6.163s) 2022-11-23T02:12:26.7266923Z 2022-11-23T02:12:26.7267193Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7267309Z Ran 1 test in 6.163s 2022-11-23T02:12:26.7267328Z 2022-11-23T02:12:26.7267418Z OK 2022-11-23T02:12:26.7267437Z 2022-11-23T02:12:26.7267560Z Generating XML reports... 2022-11-23T02:12:26.7268014Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020150.xml 2022-11-23T02:12:26.7268388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7268548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7269151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7269360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7269380Z 2022-11-23T02:12:26.7269488Z Running tests... 2022-11-23T02:12:26.7269758Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7270081Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7270363Z test_ddp_returns_tensor_with_no_grad (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7271188Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78595 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.604s) 2022-11-23T02:12:26.7271212Z 2022-11-23T02:12:26.7271482Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7271574Z Ran 1 test in 1.604s 2022-11-23T02:12:26.7271611Z 2022-11-23T02:12:26.7271700Z OK (skipped=1) 2022-11-23T02:12:26.7271719Z 2022-11-23T02:12:26.7271846Z Generating XML reports... 2022-11-23T02:12:26.7272297Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020159.xml 2022-11-23T02:12:26.7272736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7272917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7273302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7273497Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7273517Z 2022-11-23T02:12:26.7273623Z Running tests... 2022-11-23T02:12:26.7273868Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7274182Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7274467Z test_ddp_shared_grad_acc_unused_params (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7274692Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31420 2022-11-23T02:12:26.7274912Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31421 2022-11-23T02:12:26.7275288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7275465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7275852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7276045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7276400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7276575Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7276951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7277144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7277392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7277646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7278050Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7278453Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7278665Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7278898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7279157Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcrqlzk7z 2022-11-23T02:12:26.7279436Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcrqlzk7z/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7279739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppx02_tfr 2022-11-23T02:12:26.7280018Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppx02_tfr/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7280944Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:12:26.7281059Z warnings.warn( 2022-11-23T02:12:26.7281984Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:12:26.7282148Z warnings.warn( 2022-11-23T02:12:26.7282388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7282607Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:12:26.7282882Z [1669168928.781173] [08317a7e7676:31420:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7283113Z [1669168928.794789] [08317a7e7676:31420:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7283353Z [1669168928.794789] [08317a7e7676:31420:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7283630Z [1669168928.783922] [08317a7e7676:31421:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7283860Z [1669168928.797305] [08317a7e7676:31421:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7284096Z [1669168928.797305] [08317a7e7676:31421:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7284198Z ok (6.635s) 2022-11-23T02:12:26.7284217Z 2022-11-23T02:12:26.7284490Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7284584Z Ran 1 test in 6.635s 2022-11-23T02:12:26.7284622Z 2022-11-23T02:12:26.7284695Z OK 2022-11-23T02:12:26.7284714Z 2022-11-23T02:12:26.7284840Z Generating XML reports... 
2022-11-23T02:12:26.7285291Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020203.xml 2022-11-23T02:12:26.7285672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7285852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7286237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7286431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7286451Z 2022-11-23T02:12:26.7286558Z Running tests... 2022-11-23T02:12:26.7286801Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7287116Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7287397Z test_ddp_static_graph_nested_types (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7288199Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77625 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.611s) 2022-11-23T02:12:26.7288222Z 2022-11-23T02:12:26.7288484Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7288596Z Ran 1 test in 1.611s 2022-11-23T02:12:26.7288615Z 2022-11-23T02:12:26.7288721Z OK (skipped=1) 2022-11-23T02:12:26.7288739Z 2022-11-23T02:12:26.7288863Z Generating XML reports... 
2022-11-23T02:12:26.7289313Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020212.xml 2022-11-23T02:12:26.7289725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7289887Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7290332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7290524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7290544Z 2022-11-23T02:12:26.7290657Z Running tests... 2022-11-23T02:12:26.7290922Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7291236Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7291511Z test_ddp_sync_bn_training_vs_eval (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7291733Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31572 2022-11-23T02:12:26.7291938Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31573 2022-11-23T02:12:26.7292314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7292493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7292879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7293074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7293445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7293621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7294000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7294193Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7294424Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7294679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7295085Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7295488Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7295720Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7295950Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7296209Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7bjd04rk 2022-11-23T02:12:26.7296482Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7bjd04rk/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7296739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq_2g3n97 2022-11-23T02:12:26.7296993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq_2g3n97/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7297313Z [1669168942.141859] [08317a7e7676:31573:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7297556Z [1669168942.155156] [08317a7e7676:31573:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7297796Z [1669168942.155156] [08317a7e7676:31573:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7298139Z STAGE:2022-11-23 02:02:22 31573:31573 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7298414Z [1669168942.137313] [08317a7e7676:31572:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7298648Z [1669168942.151056] [08317a7e7676:31572:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7298948Z [1669168942.151056] [08317a7e7676:31572:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7299292Z STAGE:2022-11-23 02:02:22 31572:31572 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7299515Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7299757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:12:26.7300100Z STAGE:2022-11-23 02:02:22 31573:31573 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7300431Z STAGE:2022-11-23 02:02:22 31572:31572 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7300779Z STAGE:2022-11-23 02:02:22 31572:31572 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7301130Z STAGE:2022-11-23 02:02:22 31573:31573 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7301457Z STAGE:2022-11-23 02:02:22 31572:31572 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7301798Z STAGE:2022-11-23 02:02:23 31572:31572 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7302145Z STAGE:2022-11-23 02:02:23 31572:31572 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7302230Z ok (7.552s) 2022-11-23T02:12:26.7302249Z 2022-11-23T02:12:26.7302518Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7302629Z Ran 1 test in 7.552s 2022-11-23T02:12:26.7302648Z 2022-11-23T02:12:26.7302738Z OK 2022-11-23T02:12:26.7302756Z 2022-11-23T02:12:26.7302879Z Generating XML reports... 
2022-11-23T02:12:26.7303332Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020216.xml 2022-11-23T02:12:26.7303712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7303890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7304275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7304451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7304470Z 2022-11-23T02:12:26.7304578Z Running tests... 2022-11-23T02:12:26.7304836Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7305153Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7305420Z test_ddp_sync_module_states (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7305640Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31694 2022-11-23T02:12:26.7305862Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31695 2022-11-23T02:12:26.7306241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7306445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7306833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7307024Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7307394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7307568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7307948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7308187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7308435Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7308685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7309297Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7309712Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7309943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7310171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7310434Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwurds7y_ 2022-11-23T02:12:26.7310710Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwurds7y_/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7310964Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqii_has0 2022-11-23T02:12:26.7311239Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqii_has0/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7311518Z [1669168952.193386] [08317a7e7676:31695:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7311733Z [1669168952.206809] [08317a7e7676:31695:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7311971Z [1669168952.206809] [08317a7e7676:31695:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7312246Z [1669168952.185995] [08317a7e7676:31694:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7312477Z [1669168952.199983] [08317a7e7676:31694:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7312716Z [1669168952.199983] [08317a7e7676:31694:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7312815Z ok (6.151s) 2022-11-23T02:12:26.7312835Z 2022-11-23T02:12:26.7313105Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7313217Z Ran 1 test in 6.152s 2022-11-23T02:12:26.7313236Z 2022-11-23T02:12:26.7313326Z OK 2022-11-23T02:12:26.7313345Z 2022-11-23T02:12:26.7313451Z Generating XML reports... 2022-11-23T02:12:26.7313901Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020226.xml 2022-11-23T02:12:26.7314275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7314459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7314845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7315110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7315132Z 2022-11-23T02:12:26.7315245Z Running tests... 2022-11-23T02:12:26.7315512Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7315828Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7316089Z test_ddp_uneven_input_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7316312Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 31808 2022-11-23T02:12:26.7316532Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 31809 2022-11-23T02:12:26.7316970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7317145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7317535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7317724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7318089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7318244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7318622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7318811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7319068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7319316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7319726Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7320128Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7320362Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7320594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7320836Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpexy4ylcl 2022-11-23T02:12:26.7321112Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpexy4ylcl/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7321373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuwa4fm2v 2022-11-23T02:12:26.7321641Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuwa4fm2v/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7321921Z [1669168960.908507] [08317a7e7676:31809:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7322153Z [1669168960.922071] [08317a7e7676:31809:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7322391Z [1669168960.922071] [08317a7e7676:31809:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7322662Z [1669168960.906862] [08317a7e7676:31808:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7322891Z [1669168960.920927] [08317a7e7676:31808:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7323131Z [1669168960.920927] [08317a7e7676:31808:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7323215Z ok (6.163s) 2022-11-23T02:12:26.7323234Z 2022-11-23T02:12:26.7323553Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7323672Z Ran 1 test in 6.163s 2022-11-23T02:12:26.7323691Z 2022-11-23T02:12:26.7323783Z OK 2022-11-23T02:12:26.7323802Z 2022-11-23T02:12:26.7323924Z Generating XML reports... 2022-11-23T02:12:26.7324377Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020235.xml 2022-11-23T02:12:26.7324755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7324933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7325362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7325555Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7325574Z 2022-11-23T02:12:26.7325681Z Running tests... 2022-11-23T02:12:26.7325947Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7326266Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7326545Z test_ddp_uneven_input_join_disable (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7327295Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78684 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.650s) 2022-11-23T02:12:26.7327319Z 2022-11-23T02:12:26.7327582Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7327693Z Ran 1 test in 1.650s 2022-11-23T02:12:26.7327712Z 2022-11-23T02:12:26.7327820Z OK (skipped=1) 2022-11-23T02:12:26.7327839Z 2022-11-23T02:12:26.7327944Z Generating XML reports... 2022-11-23T02:12:26.7328395Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020244.xml 2022-11-23T02:12:26.7328770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7328944Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7329328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7329515Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7329535Z 2022-11-23T02:12:26.7329647Z Running tests... 2022-11-23T02:12:26.7329910Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7330206Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7330479Z test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7331234Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/75648 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.659s) 2022-11-23T02:12:26.7331254Z 2022-11-23T02:12:26.7331517Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7331626Z Ran 1 test in 1.659s 2022-11-23T02:12:26.7331645Z 2022-11-23T02:12:26.7331748Z OK (skipped=1) 2022-11-23T02:12:26.7331767Z 2022-11-23T02:12:26.7331896Z Generating XML reports... 2022-11-23T02:12:26.7332346Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020248.xml 2022-11-23T02:12:26.7332719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7332945Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7333317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7333511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7333531Z 2022-11-23T02:12:26.7333637Z Running tests... 2022-11-23T02:12:26.7333898Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7334210Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7334503Z test_ddp_uneven_inputs_stop_iteration_sync_bn (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7335309Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78113 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(1.632s) 2022-11-23T02:12:26.7335329Z 2022-11-23T02:12:26.7335593Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7335703Z Ran 1 test in 1.632s 2022-11-23T02:12:26.7335722Z 2022-11-23T02:12:26.7335827Z OK (skipped=1) 2022-11-23T02:12:26.7335846Z 2022-11-23T02:12:26.7335949Z Generating XML reports... 2022-11-23T02:12:26.7336397Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020252.xml 2022-11-23T02:12:26.7336773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7336952Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7337336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7337532Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7337552Z 2022-11-23T02:12:26.7337659Z Running tests... 2022-11-23T02:12:26.7337921Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7338217Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7338519Z test_ddp_unused_params_rebuild_buckets_exception (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7338741Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32024 2022-11-23T02:12:26.7338962Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32025 2022-11-23T02:12:26.7339344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7339519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7339896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7340073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7340451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7340626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7341006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7341196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7341448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7341698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7342146Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7342560Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7342793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7343025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7343267Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp53biddxv 2022-11-23T02:12:26.7343541Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp53biddxv/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7343844Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5_9a0yig 2022-11-23T02:12:26.7344107Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5_9a0yig/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7344390Z [1669168982.130646] [08317a7e7676:32025:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7344621Z [1669168982.144513] [08317a7e7676:32025:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7344862Z [1669168982.144513] [08317a7e7676:32025:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7345136Z [1669168982.130462] [08317a7e7676:32024:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7345364Z [1669168982.144536] [08317a7e7676:32024:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7345587Z [1669168982.144536] [08317a7e7676:32024:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7345687Z ok (6.552s) 2022-11-23T02:12:26.7345709Z 2022-11-23T02:12:26.7345978Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7346089Z Ran 1 test in 6.552s 2022-11-23T02:12:26.7346109Z 2022-11-23T02:12:26.7346202Z OK 2022-11-23T02:12:26.7346221Z 2022-11-23T02:12:26.7346344Z Generating XML reports... 2022-11-23T02:12:26.7346799Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020256.xml 2022-11-23T02:12:26.7347174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7347350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7347723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7347914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7347933Z 2022-11-23T02:12:26.7348044Z Running tests... 2022-11-23T02:12:26.7348308Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7348625Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7348897Z test_ddp_zero_output_features (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7349338Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32142 2022-11-23T02:12:26.7349564Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32143 2022-11-23T02:12:26.7349927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7350110Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7350492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7350755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7351137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7351313Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7351696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7351889Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7352136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7352366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7352840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7353246Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7353479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7353864Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T02:12:26.7354118Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T02:12:26.7354346Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7354719Z /opt/conda/lib/python3.10/site-packages/torch/nn/init.py:405: UserWarning: Initializing zero-element tensors is a no-op 2022-11-23T02:12:26.7354981Z warnings.warn("Initializing zero-element tensors is a no-op") 2022-11-23T02:12:26.7355223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprr7j4sam 2022-11-23T02:12:26.7355502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprr7j4sam/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7355759Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2se9aa5t 2022-11-23T02:12:26.7356029Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2se9aa5t/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7356305Z [1669168991.388507] [08317a7e7676:32143:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7356538Z [1669168991.402042] [08317a7e7676:32143:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7356779Z [1669168991.402042] [08317a7e7676:32143:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7357054Z [1669168991.385989] [08317a7e7676:32142:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7357282Z [1669168991.400069] [08317a7e7676:32142:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7357517Z [1669168991.400069] [08317a7e7676:32142:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7357602Z ok (6.263s) 2022-11-23T02:12:26.7357623Z 2022-11-23T02:12:26.7357898Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7358009Z Ran 1 test in 6.263s 2022-11-23T02:12:26.7358028Z 2022-11-23T02:12:26.7358119Z OK 2022-11-23T02:12:26.7358138Z 2022-11-23T02:12:26.7358266Z Generating XML reports... 2022-11-23T02:12:26.7358716Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020305.xml 2022-11-23T02:12:26.7359093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7359367Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7359745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7359940Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7359959Z 2022-11-23T02:12:26.7360065Z Running tests... 2022-11-23T02:12:26.7360327Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7360639Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7360902Z test_destroy_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7361187Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32256 2022-11-23T02:12:26.7361406Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32257 2022-11-23T02:12:26.7361770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7361950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7362334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7362524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7362888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7363063Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7363442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7363638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7363890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7364122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7364528Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7364932Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7365163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7365409Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7365631Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7365871Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7366275Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7366673Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7366757Z ok (4.267s) 2022-11-23T02:12:26.7366776Z 2022-11-23T02:12:26.7367044Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7367156Z Ran 1 test in 4.267s 2022-11-23T02:12:26.7367175Z 2022-11-23T02:12:26.7367268Z OK 2022-11-23T02:12:26.7367286Z 2022-11-23T02:12:26.7367410Z Generating XML reports... 2022-11-23T02:12:26.7367861Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020314.xml 2022-11-23T02:12:26.7368237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7368414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7368843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7369026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7369046Z 2022-11-23T02:12:26.7369155Z Running tests... 
2022-11-23T02:12:26.7369421Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7369737Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7369995Z test_destroy_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7370219Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32359 2022-11-23T02:12:26.7370490Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32360 2022-11-23T02:12:26.7370873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7371036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7371422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7371611Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7371986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7372160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7372535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7372724Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7372973Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7373217Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7373608Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7374011Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7374241Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7374485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7374714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7374962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7375364Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7375818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7375920Z ok (4.380s) 2022-11-23T02:12:26.7375940Z 2022-11-23T02:12:26.7376185Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7376295Z Ran 1 test in 4.380s 2022-11-23T02:12:26.7376314Z 2022-11-23T02:12:26.7376402Z OK 2022-11-23T02:12:26.7376421Z 2022-11-23T02:12:26.7376543Z Generating XML reports... 
2022-11-23T02:12:26.7376992Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020321.xml 2022-11-23T02:12:26.7377365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7377545Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7377929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7378153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7378192Z 2022-11-23T02:12:26.7378286Z Running tests... 2022-11-23T02:12:26.7378546Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7378861Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7379141Z test_detect_ddp_is_actually_static (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7379896Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78767 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.643s) 2022-11-23T02:12:26.7379960Z 2022-11-23T02:12:26.7380227Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7380338Z Ran 1 test in 1.643s 2022-11-23T02:12:26.7380361Z 2022-11-23T02:12:26.7380468Z OK (skipped=1) 2022-11-23T02:12:26.7380487Z 2022-11-23T02:12:26.7380610Z Generating XML reports... 
2022-11-23T02:12:26.7381040Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020328.xml 2022-11-23T02:12:26.7381415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7381588Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7381972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7382168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7382188Z 2022-11-23T02:12:26.7382295Z Running tests... 2022-11-23T02:12:26.7382557Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7382874Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7383136Z test_different_graph_across_ranks (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7383884Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78748 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.654s) 2022-11-23T02:12:26.7383922Z 2022-11-23T02:12:26.7384163Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7384276Z Ran 1 test in 1.654s 2022-11-23T02:12:26.7384295Z 2022-11-23T02:12:26.7384396Z OK (skipped=1) 2022-11-23T02:12:26.7384414Z 2022-11-23T02:12:26.7384536Z Generating XML reports... 
2022-11-23T02:12:26.7384987Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020332.xml 2022-11-23T02:12:26.7385362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7385540Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7385921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7386096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7386134Z 2022-11-23T02:12:26.7386224Z Running tests... 2022-11-23T02:12:26.7386487Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7386805Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7387077Z test_dump_DDP_relevant_env_vars (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7387345Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32530 2022-11-23T02:12:26.7387574Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32531 2022-11-23T02:12:26.7387951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7388126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7388493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7388686Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7389267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7389531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7389953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7390151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7390400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7390649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7391036Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7391437Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7391668Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7391902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7392002Z ok (4.237s) 2022-11-23T02:12:26.7392022Z 2022-11-23T02:12:26.7392294Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7392404Z Ran 1 test in 4.237s 2022-11-23T02:12:26.7392423Z 2022-11-23T02:12:26.7392515Z OK 2022-11-23T02:12:26.7392534Z 2022-11-23T02:12:26.7392657Z Generating XML reports... 2022-11-23T02:12:26.7393093Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020336.xml 2022-11-23T02:12:26.7393468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7393643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7394024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7394222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7394242Z 2022-11-23T02:12:26.7394349Z Running tests... 2022-11-23T02:12:26.7394613Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7394929Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7395170Z test_gather (__main__.TestDistBackendWithSpawn) ... 
skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7395204Z 2022-11-23T02:12:26.7395449Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7395558Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7395577Z 2022-11-23T02:12:26.7395683Z OK (skipped=1) 2022-11-23T02:12:26.7395702Z 2022-11-23T02:12:26.7395824Z Generating XML reports... 2022-11-23T02:12:26.7396273Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020343.xml 2022-11-23T02:12:26.7396653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7396829Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7397277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7397461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7397498Z 2022-11-23T02:12:26.7397589Z Running tests... 2022-11-23T02:12:26.7397855Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7398167Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7398437Z test_gather_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7398503Z 2022-11-23T02:12:26.7398773Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7398885Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7398904Z 2022-11-23T02:12:26.7399010Z OK (skipped=1) 2022-11-23T02:12:26.7399028Z 2022-11-23T02:12:26.7399150Z Generating XML reports... 
2022-11-23T02:12:26.7399587Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020346.xml 2022-11-23T02:12:26.7399962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7400136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7400516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7400707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7400726Z 2022-11-23T02:12:26.7400836Z Running tests... 2022-11-23T02:12:26.7401099Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7401409Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7401650Z test_gather_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T02:12:26.7401687Z 2022-11-23T02:12:26.7401932Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7402042Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7402061Z 2022-11-23T02:12:26.7402164Z OK (skipped=1) 2022-11-23T02:12:26.7402183Z 2022-11-23T02:12:26.7402306Z Generating XML reports... 
2022-11-23T02:12:26.7402754Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020348.xml 2022-11-23T02:12:26.7403127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7403307Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7403691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7403868Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7403904Z 2022-11-23T02:12:26.7403993Z Running tests... 2022-11-23T02:12:26.7404253Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7404566Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7404835Z test_gather_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7404855Z 2022-11-23T02:12:26.7405113Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7405223Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7405242Z 2022-11-23T02:12:26.7405348Z OK (skipped=1) 2022-11-23T02:12:26.7405367Z 2022-11-23T02:12:26.7405492Z Generating XML reports... 
2022-11-23T02:12:26.7405919Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020350.xml 2022-11-23T02:12:26.7406334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7406515Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7406902Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7407093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7407113Z 2022-11-23T02:12:26.7407221Z Running tests... 2022-11-23T02:12:26.7407483Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7407795Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7408115Z test_gather_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7408135Z 2022-11-23T02:12:26.7408380Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7408497Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7408516Z 2022-11-23T02:12:26.7408624Z OK (skipped=1) 2022-11-23T02:12:26.7408642Z 2022-11-23T02:12:26.7408764Z Generating XML reports... 
2022-11-23T02:12:26.7409208Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020353.xml 2022-11-23T02:12:26.7409579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7409754Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7410136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7410314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7410350Z 2022-11-23T02:12:26.7410439Z Running tests... 2022-11-23T02:12:26.7410702Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7411017Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7411285Z test_gather_object (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7411305Z 2022-11-23T02:12:26.7411565Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7411673Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7411692Z 2022-11-23T02:12:26.7411798Z OK (skipped=1) 2022-11-23T02:12:26.7411816Z 2022-11-23T02:12:26.7411937Z Generating XML reports... 
2022-11-23T02:12:26.7412368Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020355.xml 2022-11-23T02:12:26.7412751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7412928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7413312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7413501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7413520Z 2022-11-23T02:12:26.7413625Z Running tests... 2022-11-23T02:12:26.7413889Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7414202Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7414483Z test_gather_object_subgroup (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7414503Z 2022-11-23T02:12:26.7414744Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7414861Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7414881Z 2022-11-23T02:12:26.7414986Z OK (skipped=1) 2022-11-23T02:12:26.7415005Z 2022-11-23T02:12:26.7415123Z Generating XML reports... 
2022-11-23T02:12:26.7415621Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020358.xml 2022-11-23T02:12:26.7416005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7416180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7416565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7416757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7416776Z 2022-11-23T02:12:26.7416864Z Running tests... 2022-11-23T02:12:26.7417178Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7417490Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7417743Z test_get_backend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7417968Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 32864 2022-11-23T02:12:26.7418189Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 32865 2022-11-23T02:12:26.7418562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7418737Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7419101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7419292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7419667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7419840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7420225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7420415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7420662Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7420907Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7421314Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7421700Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7421938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7422179Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7422409Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7422653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7423056Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7423456Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7423556Z ok (4.494s) 2022-11-23T02:12:26.7423576Z 2022-11-23T02:12:26.7423838Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7423936Z Ran 1 test in 4.494s 2022-11-23T02:12:26.7423956Z 2022-11-23T02:12:26.7424044Z OK 2022-11-23T02:12:26.7424063Z 2022-11-23T02:12:26.7424187Z Generating XML reports... 2022-11-23T02:12:26.7424639Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020400.xml 2022-11-23T02:12:26.7425061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7425245Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7425630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7425825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7425844Z 2022-11-23T02:12:26.7425934Z Running tests... 
2022-11-23T02:12:26.7426196Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7426573Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7426854Z test_get_future (__main__.TestDistBackendWithSpawn) ... skip: get_future is only supported on mpi, nccl and gloo (0.002s) 2022-11-23T02:12:26.7426874Z 2022-11-23T02:12:26.7427140Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7427250Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7427269Z 2022-11-23T02:12:26.7427375Z OK (skipped=1) 2022-11-23T02:12:26.7427394Z 2022-11-23T02:12:26.7427518Z Generating XML reports... 2022-11-23T02:12:26.7427960Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020407.xml 2022-11-23T02:12:26.7428315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7428489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7428873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7429286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7429308Z 2022-11-23T02:12:26.7429417Z Running tests... 2022-11-23T02:12:26.7429693Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7430011Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7430258Z test_get_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7430463Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33000 2022-11-23T02:12:26.7430683Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33001 2022-11-23T02:12:26.7431061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7431238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7431626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7431817Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7432190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7432362Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7432738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7432913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7433164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7433410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7433818Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7434218Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7434520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7434765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7434867Z ok (4.446s) 2022-11-23T02:12:26.7434886Z 2022-11-23T02:12:26.7435156Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7435250Z Ran 1 test in 4.446s 2022-11-23T02:12:26.7435269Z 2022-11-23T02:12:26.7435362Z OK 2022-11-23T02:12:26.7435380Z 2022-11-23T02:12:26.7435504Z Generating XML reports... 2022-11-23T02:12:26.7435952Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020409.xml 2022-11-23T02:12:26.7436396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7436573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7436961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7437153Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7437172Z 2022-11-23T02:12:26.7437261Z Running tests... 2022-11-23T02:12:26.7437523Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7437832Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7438102Z test_get_rank_size_full_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7438322Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33103 2022-11-23T02:12:26.7438544Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33104 2022-11-23T02:12:26.7438917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7439094Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7439474Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7439646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7440012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7440184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7440559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7440749Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7440998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7441246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7441650Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7442032Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7442269Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7442515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7442743Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7442989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7443382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7443831Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7443939Z ok (4.354s) 2022-11-23T02:12:26.7443959Z 2022-11-23T02:12:26.7444222Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7444314Z Ran 1 test in 4.354s 2022-11-23T02:12:26.7444350Z 2022-11-23T02:12:26.7444425Z OK 2022-11-23T02:12:26.7444443Z 2022-11-23T02:12:26.7444564Z Generating XML reports... 2022-11-23T02:12:26.7445016Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020416.xml 2022-11-23T02:12:26.7445443Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7445621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7446008Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7446203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7446222Z 2022-11-23T02:12:26.7446328Z Running tests... 
2022-11-23T02:12:26.7446573Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7446887Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7447159Z test_get_rank_size_group (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7447377Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33206 2022-11-23T02:12:26.7447601Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33207 2022-11-23T02:12:26.7447974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7448152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7448538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7448711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7449078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7449251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7449625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7449814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7450067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7450313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7450720Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7451122Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7451337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7451582Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7451806Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7452045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7452453Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7452892Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7452999Z ok (4.353s) 2022-11-23T02:12:26.7453019Z 2022-11-23T02:12:26.7453288Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7453402Z Ran 1 test in 4.353s 2022-11-23T02:12:26.7453422Z 2022-11-23T02:12:26.7453495Z OK 2022-11-23T02:12:26.7453514Z 2022-11-23T02:12:26.7453637Z Generating XML reports... 
2022-11-23T02:12:26.7454088Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020423.xml 2022-11-23T02:12:26.7454463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7454687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7455072Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7455267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7455287Z 2022-11-23T02:12:26.7455396Z Running tests... 2022-11-23T02:12:26.7455642Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7455956Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7456222Z test_invalid_static_graph (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7456444Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33309 2022-11-23T02:12:26.7456662Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33310 2022-11-23T02:12:26.7457039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7457213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7457599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7457791Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7458141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7458315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7458690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7458880Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7459128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7459378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7459787Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7460189Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7460421Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7460636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7460895Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuj50_iv0 2022-11-23T02:12:26.7461163Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuj50_iv0/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7461423Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3wf9_mtr 2022-11-23T02:12:26.7461691Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3wf9_mtr/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7462017Z [1669169075.977451] [08317a7e7676:33309:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7462259Z [1669169075.991205] [08317a7e7676:33309:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7462497Z [1669169075.991205] [08317a7e7676:33309:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7462771Z [1669169075.983640] [08317a7e7676:33310:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7462982Z [1669169075.996804] [08317a7e7676:33310:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7463266Z [1669169075.996804] [08317a7e7676:33310:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7463367Z ok (6.628s) 2022-11-23T02:12:26.7463386Z 2022-11-23T02:12:26.7463660Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7463773Z Ran 1 test in 6.628s 2022-11-23T02:12:26.7463792Z 2022-11-23T02:12:26.7463882Z OK 2022-11-23T02:12:26.7463901Z 2022-11-23T02:12:26.7464025Z Generating XML reports... 2022-11-23T02:12:26.7464474Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020430.xml 2022-11-23T02:12:26.7464848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7465006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7465397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7465591Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7465610Z 2022-11-23T02:12:26.7465717Z Running tests... 2022-11-23T02:12:26.7465986Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7466300Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7466543Z test_irecv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7466767Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33427 2022-11-23T02:12:26.7466969Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33428 2022-11-23T02:12:26.7467347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7467527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7467910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7468097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7468466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7468637Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7469220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7469421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7469657Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7469908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7470321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7470790Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7471030Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7471260Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7471537Z [1669169083.769599] [08317a7e7676:33428:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7471771Z [1669169085.198515] [08317a7e7676:33428:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7472007Z [1669169085.198515] [08317a7e7676:33428:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7472319Z [1669169083.762471] [08317a7e7676:33427:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7472553Z [1669169085.181477] [08317a7e7676:33427:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7472788Z [1669169085.181477] [08317a7e7676:33427:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7472891Z ok (6.250s) 2022-11-23T02:12:26.7472910Z 2022-11-23T02:12:26.7473177Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7473289Z Ran 1 test in 6.251s 2022-11-23T02:12:26.7473308Z 2022-11-23T02:12:26.7473401Z OK 2022-11-23T02:12:26.7473420Z 2022-11-23T02:12:26.7473545Z Generating XML reports... 
2022-11-23T02:12:26.7473995Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020439.xml 2022-11-23T02:12:26.7474359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7474534Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7474923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7475115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7475134Z 2022-11-23T02:12:26.7475240Z Running tests... 2022-11-23T02:12:26.7475503Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7475818Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7476060Z test_isend (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7476283Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33537 2022-11-23T02:12:26.7476490Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33538 2022-11-23T02:12:26.7476866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7477043Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7477424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7477614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7477982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7478155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7478536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7478711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7478960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7479250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7479665Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7480064Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7480297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7480528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7480805Z [1669169092.587381] [08317a7e7676:33538:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7481085Z [1669169094.007092] [08317a7e7676:33538:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7481325Z [1669169094.007092] [08317a7e7676:33538:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7481582Z [1669169092.585792] [08317a7e7676:33537:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7481816Z [1669169094.036254] [08317a7e7676:33537:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7482053Z [1669169094.036254] [08317a7e7676:33537:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7482154Z ok (6.157s) 2022-11-23T02:12:26.7482174Z 2022-11-23T02:12:26.7482444Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7482559Z Ran 1 test in 6.157s 2022-11-23T02:12:26.7482579Z 2022-11-23T02:12:26.7482671Z OK 2022-11-23T02:12:26.7482690Z 2022-11-23T02:12:26.7482814Z Generating XML reports... 
2022-11-23T02:12:26.7483267Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020448.xml 2022-11-23T02:12:26.7483627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7483804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7484185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7484376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7484396Z 2022-11-23T02:12:26.7484503Z Running tests... 2022-11-23T02:12:26.7484769Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7485084Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7485360Z test_isend_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7485569Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33647 2022-11-23T02:12:26.7485790Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33648 2022-11-23T02:12:26.7486160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7486336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7486714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7486906Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7487275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7487450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7487878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7488058Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7488306Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7488556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7488957Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7489356Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7489647Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7490035Z STAGE:2022-11-23 02:05:01 33648:33648 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7490268Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7490590Z STAGE:2022-11-23 02:05:01 33647:33647 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7490866Z [1669169101.273523] [08317a7e7676:33648:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7491099Z [1669169102.901533] [08317a7e7676:33648:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7491336Z [1669169102.901533] [08317a7e7676:33648:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7491681Z STAGE:2022-11-23 02:05:03 33648:33648 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7492034Z STAGE:2022-11-23 02:05:03 33648:33648 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7492312Z [1669169101.251788] [08317a7e7676:33647:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7492539Z [1669169102.913535] [08317a7e7676:33647:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7492768Z [1669169102.913535] [08317a7e7676:33647:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7493108Z STAGE:2022-11-23 02:05:03 33647:33647 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7493442Z STAGE:2022-11-23 02:05:03 33647:33647 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7493547Z ok (6.722s) 2022-11-23T02:12:26.7493566Z 2022-11-23T02:12:26.7493830Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7493940Z Ran 1 test in 6.722s 2022-11-23T02:12:26.7493960Z 2022-11-23T02:12:26.7494053Z OK 2022-11-23T02:12:26.7494074Z 2022-11-23T02:12:26.7494199Z Generating XML reports... 2022-11-23T02:12:26.7494648Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020457.xml 2022-11-23T02:12:26.7495023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7495199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7495565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7495758Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7495780Z 2022-11-23T02:12:26.7495887Z Running tests... 2022-11-23T02:12:26.7496148Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7496461Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7496788Z test_isend_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7497018Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33761 2022-11-23T02:12:26.7497239Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33762 2022-11-23T02:12:26.7497596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7497773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7498155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7498447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7498817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7498993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7499375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7499564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7499811Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7500040Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7500442Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7500848Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7501081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7501422Z STAGE:2022-11-23 02:05:10 33762:33762 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7501653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7501984Z STAGE:2022-11-23 02:05:10 33761:33761 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7502262Z [1669169110.618200] [08317a7e7676:33761:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7502495Z [1669169112.287558] [08317a7e7676:33761:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7502719Z [1669169112.287558] [08317a7e7676:33761:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7503068Z STAGE:2022-11-23 02:05:12 33761:33761 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7503347Z [1669169110.620886] [08317a7e7676:33762:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7503576Z [1669169112.265786] [08317a7e7676:33762:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7503808Z [1669169112.265786] [08317a7e7676:33762:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7504152Z STAGE:2022-11-23 02:05:12 33762:33762 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7504504Z STAGE:2022-11-23 02:05:12 33761:33761 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7504860Z STAGE:2022-11-23 02:05:12 33762:33762 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7504961Z ok (6.761s) 2022-11-23T02:12:26.7504980Z 2022-11-23T02:12:26.7505293Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7505392Z Ran 1 test in 6.761s 2022-11-23T02:12:26.7505411Z 2022-11-23T02:12:26.7505503Z OK 2022-11-23T02:12:26.7505521Z 2022-11-23T02:12:26.7505643Z Generating XML reports... 2022-11-23T02:12:26.7506100Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020506.xml 2022-11-23T02:12:26.7506478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7506656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7507037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7507281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7507301Z 2022-11-23T02:12:26.7507391Z Running tests... 
2022-11-23T02:12:26.7507658Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7507977Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7508264Z test_monitored_barrier_allreduce_hang (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7508487Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33875 2022-11-23T02:12:26.7508706Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33876 2022-11-23T02:12:26.7509298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7509483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7509878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7510053Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7510423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7510597Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7510979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7511170Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7511422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7511670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7512078Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7512461Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7512703Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7512951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7513178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7513414Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7513815Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7514215Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7514459Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:12:26.7514772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:12:26.7515179Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.7515562Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:12:26.7515795Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T02:12:26.7515897Z ok (23.192s) 2022-11-23T02:12:26.7515917Z 2022-11-23T02:12:26.7516183Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7516296Z Ran 1 test in 23.193s 2022-11-23T02:12:26.7516371Z 2022-11-23T02:12:26.7516467Z OK 2022-11-23T02:12:26.7516486Z 2022-11-23T02:12:26.7516612Z Generating XML reports... 2022-11-23T02:12:26.7517066Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020516.xml 2022-11-23T02:12:26.7517430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7517608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7517991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7518184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7518203Z 2022-11-23T02:12:26.7518310Z Running tests... 2022-11-23T02:12:26.7518572Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7518886Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7519194Z test_monitored_barrier_allreduce_hang_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7519413Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 33996 2022-11-23T02:12:26.7519620Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 33997 2022-11-23T02:12:26.7519994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7520169Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7520550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7520742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7521106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7521286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7521662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7521836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7522082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7522331Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7522733Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7523132Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7523360Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7523607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7523834Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7524117Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7524502Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7524898Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7525138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:12:26.7525379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:12:26.7525774Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.7526224Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.7526463Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T02:12:26.7526693Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 100 ms 2022-11-23T02:12:26.7526790Z ok (22.499s) 2022-11-23T02:12:26.7526809Z 2022-11-23T02:12:26.7527057Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7527174Z Ran 1 test in 22.500s 2022-11-23T02:12:26.7527193Z 2022-11-23T02:12:26.7527284Z OK 2022-11-23T02:12:26.7527303Z 2022-11-23T02:12:26.7527425Z Generating XML reports... 
2022-11-23T02:12:26.7527875Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020541.xml 2022-11-23T02:12:26.7528255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7528432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7528821Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7529012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7529032Z 2022-11-23T02:12:26.7529122Z Running tests... 2022-11-23T02:12:26.7529382Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7529695Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7530111Z test_monitored_barrier_failure_order (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.7530131Z 2022-11-23T02:12:26.7530390Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7530504Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7530523Z 2022-11-23T02:12:26.7530630Z OK (skipped=1) 2022-11-23T02:12:26.7530648Z 2022-11-23T02:12:26.7530771Z Generating XML reports... 
2022-11-23T02:12:26.7531222Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020606.xml 2022-11-23T02:12:26.7531580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7531758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7532140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7532330Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7532349Z 2022-11-23T02:12:26.7532456Z Running tests... 2022-11-23T02:12:26.7532716Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7533035Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7533428Z test_monitored_barrier_gloo (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.003s) 2022-11-23T02:12:26.7533489Z 2022-11-23T02:12:26.7533754Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7533848Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7533867Z 2022-11-23T02:12:26.7533973Z OK (skipped=1) 2022-11-23T02:12:26.7533992Z 2022-11-23T02:12:26.7534114Z Generating XML reports... 
2022-11-23T02:12:26.7534565Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020609.xml 2022-11-23T02:12:26.7534939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7535116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7535549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7535740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7535759Z 2022-11-23T02:12:26.7535852Z Running tests... 2022-11-23T02:12:26.7536120Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7536430Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7536853Z test_monitored_barrier_gloo_rank_0_timeout (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.7536873Z 2022-11-23T02:12:26.7537131Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7537239Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7537258Z 2022-11-23T02:12:26.7537364Z OK (skipped=1) 2022-11-23T02:12:26.7537386Z 2022-11-23T02:12:26.7537508Z Generating XML reports... 
2022-11-23T02:12:26.7537956Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020611.xml 2022-11-23T02:12:26.7538316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7538489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7538872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7539065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7539084Z 2022-11-23T02:12:26.7539190Z Running tests... 2022-11-23T02:12:26.7539451Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7539765Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7540182Z test_monitored_barrier_gloo_subgroup (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.7540203Z 2022-11-23T02:12:26.7540462Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7540554Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7540575Z 2022-11-23T02:12:26.7540681Z OK (skipped=1) 2022-11-23T02:12:26.7540699Z 2022-11-23T02:12:26.7540822Z Generating XML reports... 
2022-11-23T02:12:26.7541273Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020614.xml 2022-11-23T02:12:26.7541650Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7541824Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7542206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7542403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7542422Z 2022-11-23T02:12:26.7542529Z Running tests... 2022-11-23T02:12:26.7542777Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7543129Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7543551Z test_monitored_barrier_wait_all_ranks (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.7543571Z 2022-11-23T02:12:26.7543830Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7543941Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7543959Z 2022-11-23T02:12:26.7544066Z OK (skipped=1) 2022-11-23T02:12:26.7544085Z 2022-11-23T02:12:26.7544209Z Generating XML reports... 
2022-11-23T02:12:26.7544654Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020616.xml 2022-11-23T02:12:26.7545062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7545238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7545626Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7545816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7545836Z 2022-11-23T02:12:26.7545941Z Running tests... 2022-11-23T02:12:26.7546203Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7546511Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7546916Z test_nccl_backend_bool_allgather (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T02:12:26.7546935Z 2022-11-23T02:12:26.7547200Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7547291Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7547311Z 2022-11-23T02:12:26.7547416Z OK (skipped=1) 2022-11-23T02:12:26.7547434Z 2022-11-23T02:12:26.7547556Z Generating XML reports... 
2022-11-23T02:12:26.7548006Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020618.xml 2022-11-23T02:12:26.7548379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7548557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7549149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7549355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7549375Z 2022-11-23T02:12:26.7549483Z Running tests... 2022-11-23T02:12:26.7549734Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7550048Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7550456Z test_nccl_backend_bool_allreduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T02:12:26.7550476Z 2022-11-23T02:12:26.7550735Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7550845Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7550867Z 2022-11-23T02:12:26.7550976Z OK (skipped=1) 2022-11-23T02:12:26.7550996Z 2022-11-23T02:12:26.7551114Z Generating XML reports... 
2022-11-23T02:12:26.7551562Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020621.xml 2022-11-23T02:12:26.7551937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7552098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7552476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7552736Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7552758Z 2022-11-23T02:12:26.7552870Z Running tests... 2022-11-23T02:12:26.7553129Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7553442Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7553840Z test_nccl_backend_bool_broadcast (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.002s) 2022-11-23T02:12:26.7553859Z 2022-11-23T02:12:26.7554114Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7554207Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7554241Z 2022-11-23T02:12:26.7554403Z OK (skipped=1) 2022-11-23T02:12:26.7554422Z 2022-11-23T02:12:26.7554544Z Generating XML reports... 
2022-11-23T02:12:26.7554994Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020623.xml 2022-11-23T02:12:26.7555373Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7555547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7555931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7556121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7556140Z 2022-11-23T02:12:26.7556244Z Running tests... 2022-11-23T02:12:26.7556491Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7556801Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7557205Z test_nccl_backend_bool_reduce (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'nccl'} (0.003s) 2022-11-23T02:12:26.7557224Z 2022-11-23T02:12:26.7557485Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7557598Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7557616Z 2022-11-23T02:12:26.7557724Z OK (skipped=1) 2022-11-23T02:12:26.7557743Z 2022-11-23T02:12:26.7557863Z Generating XML reports... 
2022-11-23T02:12:26.7558304Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020626.xml 2022-11-23T02:12:26.7558678Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7558837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7559212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7559406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7559425Z 2022-11-23T02:12:26.7559532Z Running tests... 2022-11-23T02:12:26.7559791Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7560107Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7560414Z test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn) ... skip: Only NCCL backend supports high priority stream (0.002s) 2022-11-23T02:12:26.7560434Z 2022-11-23T02:12:26.7560696Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7560806Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7560825Z 2022-11-23T02:12:26.7560913Z OK (skipped=1) 2022-11-23T02:12:26.7560931Z 2022-11-23T02:12:26.7561053Z Generating XML reports... 
2022-11-23T02:12:26.7561504Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020628.xml 2022-11-23T02:12:26.7561880Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7562054Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7562480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7562683Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7562702Z 2022-11-23T02:12:26.7562808Z Running tests... 2022-11-23T02:12:26.7563054Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7563369Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7563623Z test_new_subgroups (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T02:12:26.7563684Z 2022-11-23T02:12:26.7563950Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7564060Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7564079Z 2022-11-23T02:12:26.7564184Z OK (skipped=1) 2022-11-23T02:12:26.7564203Z 2022-11-23T02:12:26.7564324Z Generating XML reports... 
2022-11-23T02:12:26.7564773Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020630.xml 2022-11-23T02:12:26.7565145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7565303Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7565682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7565871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7565890Z 2022-11-23T02:12:26.7565996Z Running tests... 2022-11-23T02:12:26.7566262Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7566575Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7566851Z test_new_subgroups_by_enumeration (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T02:12:26.7566870Z 2022-11-23T02:12:26.7567132Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7567240Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7567259Z 2022-11-23T02:12:26.7567347Z OK (skipped=1) 2022-11-23T02:12:26.7567366Z 2022-11-23T02:12:26.7567488Z Generating XML reports... 
2022-11-23T02:12:26.7567934Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020633.xml 2022-11-23T02:12:26.7568310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7568490Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7568873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7569069Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7569089Z 2022-11-23T02:12:26.7569198Z Running tests... 2022-11-23T02:12:26.7569457Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7569756Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7570071Z test_new_subgroups_by_enumeration_input_rank_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T02:12:26.7570091Z 2022-11-23T02:12:26.7570348Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7570459Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7570477Z 2022-11-23T02:12:26.7570586Z OK (skipped=1) 2022-11-23T02:12:26.7570605Z 2022-11-23T02:12:26.7570726Z Generating XML reports... 
2022-11-23T02:12:26.7571169Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020635.xml 2022-11-23T02:12:26.7571590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7571773Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7572140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7572333Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7572352Z 2022-11-23T02:12:26.7572458Z Running tests... 2022-11-23T02:12:26.7572721Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7573034Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7573386Z test_new_subgroups_by_enumeration_negative_input_rank (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7573607Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34546 2022-11-23T02:12:26.7573832Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34547 2022-11-23T02:12:26.7574193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7574366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7574734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7574908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7575291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7575488Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7575867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7576062Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7576307Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7576538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7576940Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7577340Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7577569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7577804Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7577904Z ok (4.255s) 2022-11-23T02:12:26.7577923Z 2022-11-23T02:12:26.7578183Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7578296Z Ran 1 test in 4.255s 2022-11-23T02:12:26.7578316Z 2022-11-23T02:12:26.7578389Z OK 2022-11-23T02:12:26.7578424Z 2022-11-23T02:12:26.7578531Z Generating XML reports... 2022-11-23T02:12:26.7578982Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020638.xml 2022-11-23T02:12:26.7579354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7579529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7579910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7580107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7580126Z 2022-11-23T02:12:26.7580233Z Running tests... 2022-11-23T02:12:26.7580498Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7580838Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7581145Z test_new_subgroups_group_size_exceeds_world_size (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7581366Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34649 2022-11-23T02:12:26.7581583Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34650 2022-11-23T02:12:26.7581958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7582131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7582571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7582763Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7583118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7583292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7583669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7583858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7584105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7584351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7584759Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7585161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7585394Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7585600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7585702Z ok (4.256s) 2022-11-23T02:12:26.7585722Z 2022-11-23T02:12:26.7585989Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7586100Z Ran 1 test in 4.257s 2022-11-23T02:12:26.7586119Z 2022-11-23T02:12:26.7586211Z OK 2022-11-23T02:12:26.7586230Z 2022-11-23T02:12:26.7586351Z Generating XML reports... 2022-11-23T02:12:26.7586799Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020644.xml 2022-11-23T02:12:26.7587177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7587354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7587721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7587913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7587932Z 2022-11-23T02:12:26.7588043Z Running tests... 2022-11-23T02:12:26.7588303Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7588617Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7588896Z test_new_subgroups_overlap_not_allowed (__main__.TestDistBackendWithSpawn) ... 
skip: Test requires world size of 4 (0.002s) 2022-11-23T02:12:26.7588916Z 2022-11-23T02:12:26.7589411Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7589523Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7589543Z 2022-11-23T02:12:26.7589634Z OK (skipped=1) 2022-11-23T02:12:26.7589672Z 2022-11-23T02:12:26.7589807Z Generating XML reports... 2022-11-23T02:12:26.7590331Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020651.xml 2022-11-23T02:12:26.7590718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7590891Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7591274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7591469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7591488Z 2022-11-23T02:12:26.7591598Z Running tests... 2022-11-23T02:12:26.7591923Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7592219Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7592529Z test_new_subgroups_world_size_not_divisible_by_group_size (__main__.TestDistBackendWithSpawn) ... skip: Test requires world size of 4 (0.002s) 2022-11-23T02:12:26.7592549Z 2022-11-23T02:12:26.7592808Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7592920Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7592939Z 2022-11-23T02:12:26.7593044Z OK (skipped=1) 2022-11-23T02:12:26.7593063Z 2022-11-23T02:12:26.7593184Z Generating XML reports... 
2022-11-23T02:12:26.7593630Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020654.xml 2022-11-23T02:12:26.7594006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7594184Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7594548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7594740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7594760Z 2022-11-23T02:12:26.7594866Z Running tests... 2022-11-23T02:12:26.7595129Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7595443Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7595727Z test_output_unused_in_loss_dict_module (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7596481Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/78112 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.647s) 2022-11-23T02:12:26.7596505Z 2022-11-23T02:12:26.7596766Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7596876Z Ran 1 test in 1.647s 2022-11-23T02:12:26.7596895Z 2022-11-23T02:12:26.7597004Z OK (skipped=1) 2022-11-23T02:12:26.7597023Z 2022-11-23T02:12:26.7597128Z Generating XML reports... 
2022-11-23T02:12:26.7597577Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020656.xml 2022-11-23T02:12:26.7597951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7598127Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7598510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7598704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7598726Z 2022-11-23T02:12:26.7598834Z Running tests... 2022-11-23T02:12:26.7599098Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7599441Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7599735Z test_output_unused_in_loss_tuple_module (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7599957Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34852 2022-11-23T02:12:26.7600175Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34853 2022-11-23T02:12:26.7600549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7600723Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7601105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7601342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7601717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7601877Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7602257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7602448Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7602694Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7602942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7603348Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7603751Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7603986Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7604221Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7604463Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjkjfjac3 2022-11-23T02:12:26.7604738Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjkjfjac3/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7604995Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp6qyg0m7 2022-11-23T02:12:26.7605267Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp6qyg0m7/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7605544Z [1669169226.081084] [08317a7e7676:34852:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7605782Z [1669169226.094697] [08317a7e7676:34852:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7606023Z [1669169226.094697] [08317a7e7676:34852:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7606292Z [1669169226.089189] [08317a7e7676:34853:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7606520Z [1669169226.102561] [08317a7e7676:34853:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7606741Z [1669169226.102561] [08317a7e7676:34853:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7606845Z ok (6.643s) 2022-11-23T02:12:26.7606864Z 2022-11-23T02:12:26.7607142Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7607255Z Ran 1 test in 6.643s 2022-11-23T02:12:26.7607274Z 2022-11-23T02:12:26.7607365Z OK 2022-11-23T02:12:26.7607384Z 2022-11-23T02:12:26.7607507Z Generating XML reports... 2022-11-23T02:12:26.7608001Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020700.xml 2022-11-23T02:12:26.7608388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7608568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7608938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7609132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7609151Z 2022-11-23T02:12:26.7609258Z Running tests... 2022-11-23T02:12:26.7609582Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7609893Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7610172Z test_periodic_model_averager (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7610394Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 34970 2022-11-23T02:12:26.7610615Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 34971 2022-11-23T02:12:26.7610969Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7611147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7611532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7611723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7612091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7612262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7612642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7612833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7613085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7613316Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7613718Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7614117Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7614353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7614583Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7614863Z [1669169236.283977] [08317a7e7676:34970:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7615138Z [1669169236.289680] [08317a7e7676:34971:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7615368Z [1669169236.297840] [08317a7e7676:34970:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7615606Z [1669169236.297840] [08317a7e7676:34970:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7615839Z [1669169236.302489] [08317a7e7676:34971:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7616052Z [1669169236.302489] [08317a7e7676:34971:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7616196Z ok (7.262s) 2022-11-23T02:12:26.7616218Z 2022-11-23T02:12:26.7616575Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7616725Z Ran 1 test in 7.262s 2022-11-23T02:12:26.7616745Z 2022-11-23T02:12:26.7616918Z OK 2022-11-23T02:12:26.7616938Z 2022-11-23T02:12:26.7617098Z Generating XML reports... 
2022-11-23T02:12:26.7617569Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020710.xml 2022-11-23T02:12:26.7618105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7636234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7636864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7637051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7637073Z 2022-11-23T02:12:26.7637176Z Running tests... 2022-11-23T02:12:26.7637434Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7637740Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7638023Z test_periodic_model_averager_param_group (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7638237Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35085 2022-11-23T02:12:26.7638448Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35086 2022-11-23T02:12:26.7638814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7638982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7639359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7639544Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7639901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7640066Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7640440Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7640620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7640858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7641098Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7641493Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7641885Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7642103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7642325Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7642589Z [1669169246.014886] [08317a7e7676:35085:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7642852Z [1669169246.018426] [08317a7e7676:35086:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7643075Z [1669169246.029143] [08317a7e7676:35085:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7643374Z [1669169246.029143] [08317a7e7676:35085:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7643597Z [1669169246.030695] [08317a7e7676:35086:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7643820Z [1669169246.030695] [08317a7e7676:35086:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7643910Z ok (7.137s) 2022-11-23T02:12:26.7643932Z 2022-11-23T02:12:26.7644193Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7644290Z Ran 1 test in 7.138s 2022-11-23T02:12:26.7644309Z 2022-11-23T02:12:26.7644390Z OK 2022-11-23T02:12:26.7644410Z 2022-11-23T02:12:26.7644521Z Generating XML reports... 
2022-11-23T02:12:26.7645022Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020719.xml 2022-11-23T02:12:26.7645392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7645560Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7645936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7646119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7646139Z 2022-11-23T02:12:26.7646235Z Running tests... 2022-11-23T02:12:26.7646481Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7646788Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7647060Z test_post_localSGD_optimizer_parity (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7647819Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77123 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.660s) 2022-11-23T02:12:26.7647840Z 2022-11-23T02:12:26.7648088Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7648188Z Ran 1 test in 1.661s 2022-11-23T02:12:26.7648207Z 2022-11-23T02:12:26.7648302Z OK (skipped=1) 2022-11-23T02:12:26.7648321Z 2022-11-23T02:12:26.7648434Z Generating XML reports... 
2022-11-23T02:12:26.7648877Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020729.xml 2022-11-23T02:12:26.7649234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7649403Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7649776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7649960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7649980Z 2022-11-23T02:12:26.7650077Z Running tests... 2022-11-23T02:12:26.7650329Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7650632Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7650919Z test_post_localSGD_optimizer_parity_grad_is_view (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7651658Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/77292 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.647s) 2022-11-23T02:12:26.7651681Z 2022-11-23T02:12:26.7651933Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7652027Z Ran 1 test in 1.648s 2022-11-23T02:12:26.7652111Z 2022-11-23T02:12:26.7652215Z OK (skipped=1) 2022-11-23T02:12:26.7652234Z 2022-11-23T02:12:26.7652346Z Generating XML reports... 
2022-11-23T02:12:26.7652787Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020733.xml 2022-11-23T02:12:26.7653150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7653316Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7653687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7653920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7653940Z 2022-11-23T02:12:26.7654035Z Running tests... 2022-11-23T02:12:26.7654281Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7654587Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7654890Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7655103Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35268 2022-11-23T02:12:26.7655313Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35269 2022-11-23T02:12:26.7655675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7655840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7656212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7656385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7656750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7656916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7657287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7657465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7657703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7657943Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7658336Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7658731Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7658952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7659171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7659309Z skip: Need at least 4 CUDA devices (4.273s) 2022-11-23T02:12:26.7659329Z 2022-11-23T02:12:26.7659585Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7659684Z Ran 1 test in 4.273s 2022-11-23T02:12:26.7659703Z 2022-11-23T02:12:26.7659797Z OK (skipped=1) 2022-11-23T02:12:26.7659816Z 2022-11-23T02:12:26.7659927Z Generating XML reports... 2022-11-23T02:12:26.7660376Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020737.xml 2022-11-23T02:12:26.7660758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7660919Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7661348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7661545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7661565Z 2022-11-23T02:12:26.7661671Z Running tests... 2022-11-23T02:12:26.7661936Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7662249Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7662578Z test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7662848Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35371 2022-11-23T02:12:26.7663053Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35372 2022-11-23T02:12:26.7663431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7663608Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7663991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7664184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7664553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7664728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7665105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7665300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7665531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7665778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7666180Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7666578Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7666810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7667041Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7667193Z skip: Need at least 4 CUDA devices (4.248s) 2022-11-23T02:12:26.7667216Z 2022-11-23T02:12:26.7667481Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7667594Z Ran 1 test in 4.248s 2022-11-23T02:12:26.7667614Z 2022-11-23T02:12:26.7667704Z OK (skipped=1) 2022-11-23T02:12:26.7667723Z 2022-11-23T02:12:26.7667849Z Generating XML reports... 2022-11-23T02:12:26.7668298Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020744.xml 2022-11-23T02:12:26.7668669Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7668846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7669514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7669710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7669736Z 2022-11-23T02:12:26.7669846Z Running tests... 2022-11-23T02:12:26.7670095Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7670409Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7670775Z test_post_localSGD_optimizer_step_reload (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7671545Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/84886 for platform(s) linux. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.647s) 2022-11-23T02:12:26.7671566Z 2022-11-23T02:12:26.7671829Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7671942Z Ran 1 test in 1.647s 2022-11-23T02:12:26.7671961Z 2022-11-23T02:12:26.7672069Z OK (skipped=1) 2022-11-23T02:12:26.7672146Z 2022-11-23T02:12:26.7672275Z Generating XML reports... 2022-11-23T02:12:26.7672727Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020751.xml 2022-11-23T02:12:26.7673105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7673264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7673652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7673847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7673866Z 2022-11-23T02:12:26.7673974Z Running tests... 2022-11-23T02:12:26.7674232Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7674543Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7674818Z test_reduce_full_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7675043Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35508 2022-11-23T02:12:26.7675247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35509 2022-11-23T02:12:26.7675677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7675856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7676239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7676431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7676802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7676978Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7677364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7677558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7677791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7678040Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7678445Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7678845Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7679076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7679319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7679548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7679786Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7680236Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7680569Z STAGE:2022-11-23 02:07:59 35508:35508 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7680963Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7681295Z STAGE:2022-11-23 02:07:59 35509:35509 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7681573Z [1669169279.608496] [08317a7e7676:35509:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7681858Z [1669169281.250060] [08317a7e7676:35509:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7682099Z [1669169281.250060] [08317a7e7676:35509:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7682373Z [1669169279.584483] [08317a7e7676:35508:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7682604Z [1669169281.272357] [08317a7e7676:35508:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7682844Z [1669169281.272357] [08317a7e7676:35508:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7683402Z STAGE:2022-11-23 02:08:01 35509:35509 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:08:01 35508:35508 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7683426Z 2022-11-23T02:12:26.7683780Z STAGE:2022-11-23 02:08:01 35509:35509 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7684113Z STAGE:2022-11-23 02:08:01 35508:35508 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7684442Z STAGE:2022-11-23 02:08:01 35509:35509 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7684765Z STAGE:2022-11-23 02:08:01 35508:35508 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7685106Z STAGE:2022-11-23 02:08:01 35509:35509 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7685437Z STAGE:2022-11-23 02:08:01 35508:35508 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7685785Z STAGE:2022-11-23 02:08:01 35509:35509 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7686138Z STAGE:2022-11-23 02:08:01 35508:35508 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7686242Z ok (6.648s) 2022-11-23T02:12:26.7686262Z 2022-11-23T02:12:26.7686526Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7686625Z Ran 1 test in 6.648s 2022-11-23T02:12:26.7686644Z 2022-11-23T02:12:26.7686735Z OK 2022-11-23T02:12:26.7686754Z 2022-11-23T02:12:26.7686875Z Generating XML reports... 
2022-11-23T02:12:26.7687331Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020755.xml 2022-11-23T02:12:26.7687703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7687880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7688268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7688466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7688486Z 2022-11-23T02:12:26.7688575Z Running tests... 2022-11-23T02:12:26.7688844Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7689214Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7689490Z test_reduce_full_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7689713Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35622 2022-11-23T02:12:26.7689976Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35623 2022-11-23T02:12:26.7690354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7690530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7690970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7691145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7691521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7691697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7692081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7692273Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7692521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7692768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7693178Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7693563Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7693800Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7694047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7694273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7694514Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7694912Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7695306Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7695649Z STAGE:2022-11-23 02:08:08 35622:35622 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7695974Z STAGE:2022-11-23 02:08:08 35623:35623 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7696257Z [1669169288.850120] [08317a7e7676:35623:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7696471Z [1669169290.506821] [08317a7e7676:35623:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7696710Z [1669169290.506821] [08317a7e7676:35623:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7696984Z [1669169288.828857] [08317a7e7676:35622:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7697218Z [1669169290.529872] [08317a7e7676:35622:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7697454Z [1669169290.529872] [08317a7e7676:35622:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7698063Z STAGE:2022-11-23 02:08:10 35623:35623 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:08:10 35622:35622 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7698087Z 2022-11-23T02:12:26.7698445Z STAGE:2022-11-23 02:08:10 35623:35623 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7698797Z STAGE:2022-11-23 02:08:10 35622:35622 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7699122Z STAGE:2022-11-23 02:08:11 35623:35623 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7699446Z STAGE:2022-11-23 02:08:11 35622:35622 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7699815Z STAGE:2022-11-23 02:08:11 35623:35623 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7700155Z STAGE:2022-11-23 02:08:11 35622:35622 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7700505Z STAGE:2022-11-23 02:08:11 35623:35623 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7700851Z STAGE:2022-11-23 02:08:11 35622:35622 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7700953Z ok (6.769s) 2022-11-23T02:12:26.7700972Z 2022-11-23T02:12:26.7701237Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7701349Z Ran 1 test in 6.770s 2022-11-23T02:12:26.7701368Z 2022-11-23T02:12:26.7701460Z OK 2022-11-23T02:12:26.7701479Z 2022-11-23T02:12:26.7701602Z Generating XML reports... 
2022-11-23T02:12:26.7702036Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020804.xml 2022-11-23T02:12:26.7702414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7702595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7702978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7703169Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7703189Z 2022-11-23T02:12:26.7703295Z Running tests... 2022-11-23T02:12:26.7703557Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7703870Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7704127Z test_reduce_full_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7704354Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35736 2022-11-23T02:12:26.7704574Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35737 2022-11-23T02:12:26.7704951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7705126Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7705507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7705699Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7706070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7706247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7706615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7706812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7707059Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7707351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7707762Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7708160Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7708392Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7708637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7708916Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7709363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7709777Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7710169Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7710504Z STAGE:2022-11-23 02:08:18 35737:35737 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7710828Z STAGE:2022-11-23 02:08:18 35736:35736 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7711105Z [1669169298.199145] [08317a7e7676:35736:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7711335Z [1669169299.818818] [08317a7e7676:35736:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7711575Z [1669169299.818818] [08317a7e7676:35736:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7711852Z [1669169298.220566] [08317a7e7676:35737:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7712081Z [1669169299.863302] [08317a7e7676:35737:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7712301Z [1669169299.863302] [08317a7e7676:35737:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7712859Z STAGE:2022-11-23 02:08:20 35736:35736 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:08:20 35737:35737 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7712884Z 2022-11-23T02:12:26.7713234Z STAGE:2022-11-23 02:08:20 35737:35737 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7713578Z STAGE:2022-11-23 02:08:20 35736:35736 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7713905Z STAGE:2022-11-23 02:08:20 35737:35737 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7714225Z STAGE:2022-11-23 02:08:20 35736:35736 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7714558Z STAGE:2022-11-23 02:08:20 35737:35737 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7714887Z STAGE:2022-11-23 02:08:20 35736:35736 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7715231Z STAGE:2022-11-23 02:08:20 35737:35737 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7715582Z STAGE:2022-11-23 02:08:20 35736:35736 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7715670Z ok (6.673s) 2022-11-23T02:12:26.7715690Z 2022-11-23T02:12:26.7715954Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7716066Z Ran 1 test in 6.673s 2022-11-23T02:12:26.7716085Z 2022-11-23T02:12:26.7716263Z OK 2022-11-23T02:12:26.7716285Z 2022-11-23T02:12:26.7716413Z Generating XML reports... 
2022-11-23T02:12:26.7716869Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020814.xml 2022-11-23T02:12:26.7717249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7717425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7717794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7717984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7718065Z 2022-11-23T02:12:26.7718173Z Running tests... 2022-11-23T02:12:26.7718435Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7718753Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7719020Z test_reduce_full_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7719242Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35850 2022-11-23T02:12:26.7719458Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35851 2022-11-23T02:12:26.7719816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7719993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7720375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7720569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7720939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7721112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7721493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7721687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7721931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7722161Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7722566Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7722971Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7723203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7723450Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.7723674Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7723913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.7724316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7724653Z STAGE:2022-11-23 02:08:27 35850:35850 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7725035Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.7725374Z STAGE:2022-11-23 02:08:27 35851:35851 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7725694Z [1669169307.479345] [08317a7e7676:35851:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7725932Z [1669169309.078137] [08317a7e7676:35851:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7726167Z [1669169309.078137] [08317a7e7676:35851:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7726439Z [1669169307.457717] [08317a7e7676:35850:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7726666Z [1669169309.115399] [08317a7e7676:35850:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7726948Z [1669169309.115399] [08317a7e7676:35850:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7727507Z STAGE:2022-11-23 02:08:29 35851:35851 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:08:29 35850:35850 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7727529Z 2022-11-23T02:12:26.7727877Z STAGE:2022-11-23 02:08:29 35851:35851 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7728221Z STAGE:2022-11-23 02:08:29 35850:35850 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7728532Z STAGE:2022-11-23 02:08:29 35851:35851 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7728853Z STAGE:2022-11-23 02:08:29 35850:35850 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7729190Z STAGE:2022-11-23 02:08:29 35851:35851 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7729533Z STAGE:2022-11-23 02:08:29 35851:35851 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7729870Z STAGE:2022-11-23 02:08:29 35850:35850 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7730214Z STAGE:2022-11-23 02:08:29 35850:35850 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7730313Z ok (6.752s) 2022-11-23T02:12:26.7730332Z 2022-11-23T02:12:26.7730596Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7730690Z Ran 1 test in 6.752s 2022-11-23T02:12:26.7730726Z 2022-11-23T02:12:26.7730800Z OK 2022-11-23T02:12:26.7730819Z 2022-11-23T02:12:26.7730942Z Generating XML reports... 
2022-11-23T02:12:26.7731397Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020823.xml 2022-11-23T02:12:26.7731777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7731954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7732340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7732531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7732550Z 2022-11-23T02:12:26.7732655Z Running tests... 2022-11-23T02:12:26.7732901Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7733214Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7733471Z test_reduce_group_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7733693Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 35964 2022-11-23T02:12:26.7733917Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 35965 2022-11-23T02:12:26.7734289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7734510Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7734897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7735087Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7735442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7735617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7735998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7736238Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7736486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7736736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7737141Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7737538Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7737751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7737981Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7738138Z skip: Skipped due to small world size. (4.241s) 2022-11-23T02:12:26.7738158Z 2022-11-23T02:12:26.7738427Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7738537Z Ran 1 test in 4.241s 2022-11-23T02:12:26.7738557Z 2022-11-23T02:12:26.7738662Z OK (skipped=1) 2022-11-23T02:12:26.7738680Z 2022-11-23T02:12:26.7738801Z Generating XML reports... 2022-11-23T02:12:26.7739251Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020832.xml 2022-11-23T02:12:26.7739625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7739785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7740169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7740358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7740377Z 2022-11-23T02:12:26.7740479Z Running tests... 2022-11-23T02:12:26.7740745Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7741060Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7741320Z test_reduce_group_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7741542Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36067 2022-11-23T02:12:26.7741746Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36068 2022-11-23T02:12:26.7742121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7742296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7742673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7742861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7743235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7743408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7743835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7744033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7744263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7744508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7744912Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7745313Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7745593Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7745821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7745982Z skip: Skipped due to small world size. (4.330s) 2022-11-23T02:12:26.7746002Z 2022-11-23T02:12:26.7746264Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7746373Z Ran 1 test in 4.330s 2022-11-23T02:12:26.7746392Z 2022-11-23T02:12:26.7746481Z OK (skipped=1) 2022-11-23T02:12:26.7746500Z 2022-11-23T02:12:26.7746623Z Generating XML reports... 2022-11-23T02:12:26.7747070Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020839.xml 2022-11-23T02:12:26.7747444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7747624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7748006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7748191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7748213Z 2022-11-23T02:12:26.7748321Z Running tests... 2022-11-23T02:12:26.7748567Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7748880Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7749447Z test_reduce_group_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7749673Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36170 2022-11-23T02:12:26.7749886Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36171 2022-11-23T02:12:26.7750260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7750437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7750817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7751010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7751365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7751539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7751920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7752108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7752354Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7752604Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7753007Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7753528Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7753768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7753982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7754139Z skip: Skipped due to small world size. (4.215s) 2022-11-23T02:12:26.7754159Z 2022-11-23T02:12:26.7754429Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7754537Z Ran 1 test in 4.215s 2022-11-23T02:12:26.7754556Z 2022-11-23T02:12:26.7754659Z OK (skipped=1) 2022-11-23T02:12:26.7754734Z 2022-11-23T02:12:26.7754861Z Generating XML reports... 2022-11-23T02:12:26.7755312Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020846.xml 2022-11-23T02:12:26.7755689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7755847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7756231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7756419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7756438Z 2022-11-23T02:12:26.7756546Z Running tests... 2022-11-23T02:12:26.7756810Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7757119Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7757375Z test_reduce_group_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7757591Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36273 2022-11-23T02:12:26.7757812Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36274 2022-11-23T02:12:26.7758169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7758339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7758714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7758900Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7759266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7759438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7759820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7760006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7760238Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7760486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7760889Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7761289Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7761518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7761745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7761903Z skip: Skipped due to small world size. (4.294s) 2022-11-23T02:12:26.7761923Z 2022-11-23T02:12:26.7762182Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7762342Z Ran 1 test in 4.294s 2022-11-23T02:12:26.7762363Z 2022-11-23T02:12:26.7762456Z OK (skipped=1) 2022-11-23T02:12:26.7762475Z 2022-11-23T02:12:26.7762596Z Generating XML reports... 2022-11-23T02:12:26.7763049Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020853.xml 2022-11-23T02:12:26.7763418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7763591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7763970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7764208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7764228Z 2022-11-23T02:12:26.7764332Z Running tests... 2022-11-23T02:12:26.7764597Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7764900Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7765149Z test_reduce_max (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7765366Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36376 2022-11-23T02:12:26.7765582Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36377 2022-11-23T02:12:26.7765951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7766124Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7766506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7766702Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7767061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7767234Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7767610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7767794Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7768033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7768279Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7768680Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7769078Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7769311Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7769632Z STAGE:2022-11-23 02:09:03 36377:36377 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7769864Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7770199Z STAGE:2022-11-23 02:09:04 36376:36376 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7770469Z [1669169344.079609] [08317a7e7676:36377:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7770700Z [1669169345.700892] [08317a7e7676:36377:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7770934Z [1669169345.700892] [08317a7e7676:36377:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7771247Z [1669169344.055623] [08317a7e7676:36376:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7771479Z [1669169345.740853] [08317a7e7676:36376:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7771714Z [1669169345.740853] [08317a7e7676:36376:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7772265Z STAGE:2022-11-23 02:09:06 36377:36377 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:09:06 36376:36376 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7772285Z 2022-11-23T02:12:26.7772684Z STAGE:2022-11-23 02:09:06 36377:36377 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7773015Z STAGE:2022-11-23 02:09:06 36376:36376 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7773344Z STAGE:2022-11-23 02:09:06 36377:36377 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7773665Z STAGE:2022-11-23 02:09:06 36376:36376 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7773994Z STAGE:2022-11-23 02:09:06 36377:36377 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7774320Z STAGE:2022-11-23 02:09:06 36376:36376 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7774662Z STAGE:2022-11-23 02:09:06 36377:36377 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7775001Z STAGE:2022-11-23 02:09:06 36376:36376 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7775107Z ok (6.711s) 2022-11-23T02:12:26.7775126Z 2022-11-23T02:12:26.7775391Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7775486Z Ran 1 test in 6.711s 2022-11-23T02:12:26.7775506Z 2022-11-23T02:12:26.7775594Z OK 2022-11-23T02:12:26.7775614Z 2022-11-23T02:12:26.7775740Z Generating XML reports... 
2022-11-23T02:12:26.7776193Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020900.xml 2022-11-23T02:12:26.7776569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7776745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7777128Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7777317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7777340Z 2022-11-23T02:12:26.7777430Z Running tests... 2022-11-23T02:12:26.7777689Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7778000Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7778250Z test_reduce_min (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7778473Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36490 2022-11-23T02:12:26.7778687Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36491 2022-11-23T02:12:26.7779054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7779227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7779593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7779784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7780153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7780384Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7780770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7780958Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7781201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7781447Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7781846Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7782280Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7782510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7782844Z STAGE:2022-11-23 02:09:13 36491:36491 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7783071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7783399Z STAGE:2022-11-23 02:09:13 36490:36490 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7783671Z [1669169353.233939] [08317a7e7676:36490:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7783902Z [1669169354.918310] [08317a7e7676:36490:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7784135Z [1669169354.918310] [08317a7e7676:36490:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7784409Z [1669169353.235675] [08317a7e7676:36491:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7784639Z [1669169354.889768] [08317a7e7676:36491:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7784857Z [1669169354.889768] [08317a7e7676:36491:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7785412Z STAGE:2022-11-23 02:09:15 36490:36490 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:09:15 36491:36491 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7785432Z 2022-11-23T02:12:26.7785782Z STAGE:2022-11-23 02:09:15 36491:36491 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7786134Z STAGE:2022-11-23 02:09:15 36490:36490 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7786459Z STAGE:2022-11-23 02:09:15 36491:36491 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7786786Z STAGE:2022-11-23 02:09:15 36490:36490 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7787114Z STAGE:2022-11-23 02:09:15 36491:36491 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7787442Z STAGE:2022-11-23 02:09:15 36490:36490 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7787787Z STAGE:2022-11-23 02:09:15 36491:36491 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7788129Z STAGE:2022-11-23 02:09:15 36490:36490 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7788215Z ok (6.729s) 2022-11-23T02:12:26.7788234Z 2022-11-23T02:12:26.7788506Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7788614Z Ran 1 test in 6.729s 2022-11-23T02:12:26.7788633Z 2022-11-23T02:12:26.7788722Z OK 2022-11-23T02:12:26.7788741Z 2022-11-23T02:12:26.7788864Z Generating XML reports... 
2022-11-23T02:12:26.7789629Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020909.xml 2022-11-23T02:12:26.7790053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7790227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7790598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7790789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7790808Z 2022-11-23T02:12:26.7790913Z Running tests... 2022-11-23T02:12:26.7791172Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7791555Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7791831Z test_reduce_multigpu (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl backend supports reduce multigpu (0.002s) 2022-11-23T02:12:26.7791854Z 2022-11-23T02:12:26.7792114Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7792223Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7792242Z 2022-11-23T02:12:26.7792345Z OK (skipped=1) 2022-11-23T02:12:26.7792364Z 2022-11-23T02:12:26.7792469Z Generating XML reports... 
2022-11-23T02:12:26.7792916Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020918.xml 2022-11-23T02:12:26.7793286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7793453Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7793839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7794028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7794047Z 2022-11-23T02:12:26.7794154Z Running tests... 2022-11-23T02:12:26.7794415Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7794725Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7794966Z test_reduce_product (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7795184Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36637 2022-11-23T02:12:26.7795403Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36638 2022-11-23T02:12:26.7795774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7795948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7796330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7796520Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7796893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7797051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7797432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7797619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7797864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7798115Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7798517Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7798967Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7799202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7799537Z STAGE:2022-11-23 02:09:24 36638:36638 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7799751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7800084Z STAGE:2022-11-23 02:09:24 36637:36637 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7800361Z [1669169364.996738] [08317a7e7676:36638:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7800640Z [1669169366.623027] [08317a7e7676:36638:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7800880Z [1669169366.623027] [08317a7e7676:36638:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7801147Z [1669169364.973870] [08317a7e7676:36637:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7801373Z [1669169366.654095] [08317a7e7676:36637:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7801608Z [1669169366.654095] [08317a7e7676:36637:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7802166Z STAGE:2022-11-23 02:09:27 36638:36638 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:09:27 36637:36637 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7802191Z 2022-11-23T02:12:26.7802542Z STAGE:2022-11-23 02:09:27 36638:36638 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7802892Z STAGE:2022-11-23 02:09:27 36637:36637 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7803205Z STAGE:2022-11-23 02:09:27 36638:36638 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7803529Z STAGE:2022-11-23 02:09:27 36637:36637 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7803860Z STAGE:2022-11-23 02:09:27 36638:36638 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7804201Z STAGE:2022-11-23 02:09:27 36638:36638 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7804534Z STAGE:2022-11-23 02:09:27 36637:36637 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7804875Z STAGE:2022-11-23 02:09:27 36637:36637 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7804976Z ok (6.678s) 2022-11-23T02:12:26.7804995Z 2022-11-23T02:12:26.7805263Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7805357Z Ran 1 test in 6.678s 2022-11-23T02:12:26.7805393Z 2022-11-23T02:12:26.7805467Z OK 2022-11-23T02:12:26.7805486Z 2022-11-23T02:12:26.7805610Z Generating XML reports... 
2022-11-23T02:12:26.7806062Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020921.xml 2022-11-23T02:12:26.7806436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7806609Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7806991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7807185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7807205Z 2022-11-23T02:12:26.7807309Z Running tests... 2022-11-23T02:12:26.7807614Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7807935Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7808230Z test_reduce_scatter_tensor_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce_scatter_tensor (0.002s) 2022-11-23T02:12:26.7808249Z 2022-11-23T02:12:26.7808509Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7808623Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7808642Z 2022-11-23T02:12:26.7808744Z OK (skipped=1) 2022-11-23T02:12:26.7808763Z 2022-11-23T02:12:26.7808883Z Generating XML reports... 
2022-11-23T02:12:26.7809332Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020930.xml 2022-11-23T02:12:26.7809759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7809922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7810308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7810493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7810512Z 2022-11-23T02:12:26.7810613Z Running tests... 2022-11-23T02:12:26.7810877Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7811190Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7811463Z test_reduce_scatter_v_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports reduce_scatter_v (0.003s) 2022-11-23T02:12:26.7811485Z 2022-11-23T02:12:26.7811745Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7811838Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7811875Z 2022-11-23T02:12:26.7811964Z OK (skipped=1) 2022-11-23T02:12:26.7811983Z 2022-11-23T02:12:26.7812108Z Generating XML reports... 
2022-11-23T02:12:26.7812558Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020932.xml 2022-11-23T02:12:26.7812933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7813109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7813491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7813679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7813702Z 2022-11-23T02:12:26.7813807Z Running tests... 2022-11-23T02:12:26.7814050Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7814362Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7814611Z test_reduce_sum (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7814834Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36817 2022-11-23T02:12:26.7815052Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36818 2022-11-23T02:12:26.7815420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7815591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7815970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7816162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7816518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7816691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7817118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7817311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7817556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7817802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7818206Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7818605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7818869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7819210Z STAGE:2022-11-23 02:09:38 36817:36817 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7819436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7819768Z STAGE:2022-11-23 02:09:39 36818:36818 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7820043Z [1669169379.033695] [08317a7e7676:36817:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7820273Z [1669169380.673797] [08317a7e7676:36817:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7820504Z [1669169380.673797] [08317a7e7676:36817:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7820778Z [1669169379.034559] [08317a7e7676:36818:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7821007Z [1669169380.700277] [08317a7e7676:36818:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7821236Z [1669169380.700277] [08317a7e7676:36818:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7821778Z STAGE:2022-11-23 02:09:41 36817:36817 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:09:41 36818:36818 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7821813Z 2022-11-23T02:12:26.7822150Z STAGE:2022-11-23 02:09:41 36818:36818 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7822508Z STAGE:2022-11-23 02:09:41 36817:36817 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7822837Z STAGE:2022-11-23 02:09:41 36818:36818 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7823156Z STAGE:2022-11-23 02:09:41 36817:36817 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7823489Z STAGE:2022-11-23 02:09:41 36818:36818 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7823816Z STAGE:2022-11-23 02:09:41 36817:36817 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7824160Z STAGE:2022-11-23 02:09:41 36818:36818 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7824504Z STAGE:2022-11-23 02:09:41 36817:36817 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7824605Z ok (6.723s) 2022-11-23T02:12:26.7824624Z 2022-11-23T02:12:26.7824872Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7824986Z Ran 1 test in 6.723s 2022-11-23T02:12:26.7825005Z 2022-11-23T02:12:26.7825094Z OK 2022-11-23T02:12:26.7825112Z 2022-11-23T02:12:26.7825229Z Generating XML reports... 
2022-11-23T02:12:26.7825720Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020935.xml 2022-11-23T02:12:26.7826103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7826280Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7826664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7826840Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7826868Z 2022-11-23T02:12:26.7826959Z Running tests... 2022-11-23T02:12:26.7827221Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7827579Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7827837Z test_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T02:12:26.7827857Z 2022-11-23T02:12:26.7828118Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7828228Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7828247Z 2022-11-23T02:12:26.7828350Z OK (skipped=1) 2022-11-23T02:12:26.7828369Z 2022-11-23T02:12:26.7828489Z Generating XML reports... 
2022-11-23T02:12:26.7828920Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020944.xml 2022-11-23T02:12:26.7829529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7829704Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7830088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7830277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7830297Z 2022-11-23T02:12:26.7830402Z Running tests... 2022-11-23T02:12:26.7830664Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7830976Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7831240Z test_reduce_sum_cuda_twice (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA reduce (0.002s) 2022-11-23T02:12:26.7831259Z 2022-11-23T02:12:26.7831502Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7831611Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7831630Z 2022-11-23T02:12:26.7831736Z OK (skipped=1) 2022-11-23T02:12:26.7831755Z 2022-11-23T02:12:26.7831872Z Generating XML reports... 
2022-11-23T02:12:26.7832320Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020946.xml 2022-11-23T02:12:26.7832696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7832871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7833255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7833445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7833465Z 2022-11-23T02:12:26.7833554Z Running tests... 2022-11-23T02:12:26.7833810Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7834118Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7834371Z test_reduce_sum_twice (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7834593Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 36997 2022-11-23T02:12:26.7834815Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 36998 2022-11-23T02:12:26.7835256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7835438Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7835805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7835993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7836361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7836533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7836979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7837166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7837415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7837664Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7838063Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7838447Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7838681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7839018Z STAGE:2022-11-23 02:09:52 36998:36998 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7839247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7839577Z STAGE:2022-11-23 02:09:53 36997:36997 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7839858Z [1669169393.031446] [08317a7e7676:36998:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7840085Z [1669169394.656756] [08317a7e7676:36998:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7840322Z [1669169394.656756] [08317a7e7676:36998:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7840596Z [1669169393.025265] [08317a7e7676:36997:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7840808Z [1669169394.669378] [08317a7e7676:36997:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7841043Z [1669169394.669378] [08317a7e7676:36997:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7841596Z STAGE:2022-11-23 02:09:55 36998:36998 ActivityProfilerController.cpp:306] Completed Stage: CollectionSTAGE:2022-11-23 02:09:55 36997:36997 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7841618Z 2022-11-23T02:12:26.7841969Z STAGE:2022-11-23 02:09:55 36998:36998 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7842310Z STAGE:2022-11-23 02:09:55 36997:36997 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7842630Z STAGE:2022-11-23 02:09:55 36998:36998 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7842948Z STAGE:2022-11-23 02:09:55 36997:36997 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7843283Z STAGE:2022-11-23 02:09:55 36998:36998 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7843628Z STAGE:2022-11-23 02:09:55 36998:36998 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7844030Z STAGE:2022-11-23 02:09:55 36997:36997 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7844369Z STAGE:2022-11-23 02:09:55 36997:36997 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7844470Z ok (6.509s) 2022-11-23T02:12:26.7844491Z 2022-11-23T02:12:26.7844752Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7844858Z Ran 1 test in 6.509s 2022-11-23T02:12:26.7844877Z 2022-11-23T02:12:26.7844966Z OK 2022-11-23T02:12:26.7844984Z 2022-11-23T02:12:26.7845108Z Generating XML reports... 
2022-11-23T02:12:26.7845560Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020949.xml 2022-11-23T02:12:26.7845989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7846166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7846538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7846732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7846752Z 2022-11-23T02:12:26.7846852Z Running tests... 2022-11-23T02:12:26.7847107Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7847418Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7847675Z test_scatter (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7847695Z 2022-11-23T02:12:26.7847955Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7848063Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7848082Z 2022-11-23T02:12:26.7848172Z OK (skipped=1) 2022-11-23T02:12:26.7848214Z 2022-11-23T02:12:26.7848320Z Generating XML reports... 
2022-11-23T02:12:26.7848768Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020958.xml 2022-11-23T02:12:26.7849142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7849315Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7849694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7849881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7849900Z 2022-11-23T02:12:26.7850004Z Running tests... 2022-11-23T02:12:26.7850266Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7850561Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7850835Z test_scatter_checks (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7850855Z 2022-11-23T02:12:26.7851113Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7851221Z Ran 1 test in 0.003s 2022-11-23T02:12:26.7851240Z 2022-11-23T02:12:26.7851343Z OK (skipped=1) 2022-11-23T02:12:26.7851362Z 2022-11-23T02:12:26.7851479Z Generating XML reports... 
2022-11-23T02:12:26.7851922Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021000.xml 2022-11-23T02:12:26.7852296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7852475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7852839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7853030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7853095Z 2022-11-23T02:12:26.7853204Z Running tests... 2022-11-23T02:12:26.7853468Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7853779Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7854050Z test_scatter_complex (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7854069Z 2022-11-23T02:12:26.7854327Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7854434Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7854454Z 2022-11-23T02:12:26.7854558Z OK (skipped=1) 2022-11-23T02:12:26.7854623Z 2022-11-23T02:12:26.7854732Z Generating XML reports... 
2022-11-23T02:12:26.7855178Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021003.xml 2022-11-23T02:12:26.7855551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7855727Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7856112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7856300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7856319Z 2022-11-23T02:12:26.7856424Z Running tests... 2022-11-23T02:12:26.7856683Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7856979Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7857238Z test_scatter_cuda (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T02:12:26.7857258Z 2022-11-23T02:12:26.7857515Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7857621Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7857640Z 2022-11-23T02:12:26.7857748Z OK (skipped=1) 2022-11-23T02:12:26.7857767Z 2022-11-23T02:12:26.7857884Z Generating XML reports... 
2022-11-23T02:12:26.7858331Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021005.xml 2022-11-23T02:12:26.7858705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7858880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7859247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7859444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7859463Z 2022-11-23T02:12:26.7859566Z Running tests... 2022-11-23T02:12:26.7859825Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7860134Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7860400Z test_scatter_cuda_complex (__main__.TestDistBackendWithSpawn) ... skip: Only Nccl supports CUDA gather (0.002s) 2022-11-23T02:12:26.7860420Z 2022-11-23T02:12:26.7860677Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7860786Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7860805Z 2022-11-23T02:12:26.7860910Z OK (skipped=1) 2022-11-23T02:12:26.7860928Z 2022-11-23T02:12:26.7861034Z Generating XML reports... 
2022-11-23T02:12:26.7861483Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021007.xml 2022-11-23T02:12:26.7861861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7862034Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7862457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7862652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7862672Z 2022-11-23T02:12:26.7862778Z Running tests... 2022-11-23T02:12:26.7863036Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7863332Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7863600Z test_scatter_full_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7863619Z 2022-11-23T02:12:26.7863879Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7864035Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7864054Z 2022-11-23T02:12:26.7864158Z OK (skipped=1) 2022-11-23T02:12:26.7864177Z 2022-11-23T02:12:26.7864297Z Generating XML reports... 
2022-11-23T02:12:26.7864742Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021010.xml 2022-11-23T02:12:26.7865146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7865460Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7866186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7866438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7866460Z 2022-11-23T02:12:26.7866569Z Running tests... 2022-11-23T02:12:26.7866836Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7867147Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7867416Z test_scatter_group (__main__.TestDistBackendWithSpawn) ... skip: CPU tensor ops not supported by UCP TL (0.002s) 2022-11-23T02:12:26.7867436Z 2022-11-23T02:12:26.7867699Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7867806Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7867825Z 2022-11-23T02:12:26.7867927Z OK (skipped=1) 2022-11-23T02:12:26.7867946Z 2022-11-23T02:12:26.7868050Z Generating XML reports... 
2022-11-23T02:12:26.7868496Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021012.xml 2022-11-23T02:12:26.7868867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7869268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7869667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7869851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7869871Z 2022-11-23T02:12:26.7869977Z Running tests... 2022-11-23T02:12:26.7870239Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7870534Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7870919Z test_scatter_object_list (__main__.TestDistBackendWithSpawn) ... skip: Test requires backend to be one of {'gloo'} (0.002s) 2022-11-23T02:12:26.7870938Z 2022-11-23T02:12:26.7871192Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7871301Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7871321Z 2022-11-23T02:12:26.7871423Z OK (skipped=1) 2022-11-23T02:12:26.7871442Z 2022-11-23T02:12:26.7871566Z Generating XML reports... 
2022-11-23T02:12:26.7872012Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021015.xml 2022-11-23T02:12:26.7872378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7872643Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7873021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7873209Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7873228Z 2022-11-23T02:12:26.7873334Z Running tests... 2022-11-23T02:12:26.7873593Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7873900Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7874146Z test_send_recv (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7874430Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37375 2022-11-23T02:12:26.7874646Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37376 2022-11-23T02:12:26.7875023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7875183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7875567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7875759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7876132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7876306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7876687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7876872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7877121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7877353Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7877752Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7878150Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7878376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7878607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7878885Z [1669169421.334259] [08317a7e7676:37376:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7879120Z [1669169422.747904] [08317a7e7676:37376:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7879356Z [1669169422.747904] [08317a7e7676:37376:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7879629Z [1669169421.331702] [08317a7e7676:37375:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7879854Z [1669169422.765078] [08317a7e7676:37375:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7880071Z [1669169422.765078] [08317a7e7676:37375:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7880170Z ok (6.151s) 2022-11-23T02:12:26.7880191Z 2022-11-23T02:12:26.7880453Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7880561Z Ran 1 test in 6.151s 2022-11-23T02:12:26.7880580Z 2022-11-23T02:12:26.7880665Z OK 2022-11-23T02:12:26.7880684Z 2022-11-23T02:12:26.7880847Z Generating XML reports... 
2022-11-23T02:12:26.7881307Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021017.xml 2022-11-23T02:12:26.7881686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7881862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7882227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7882417Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7882480Z 2022-11-23T02:12:26.7882587Z Running tests... 2022-11-23T02:12:26.7882847Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7883158Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7883442Z test_send_recv_any_source (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T02:12:26.7883462Z 2022-11-23T02:12:26.7883722Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7883830Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7883849Z 2022-11-23T02:12:26.7883954Z OK (skipped=1) 2022-11-23T02:12:26.7883973Z 2022-11-23T02:12:26.7884078Z Generating XML reports... 
2022-11-23T02:12:26.7884522Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021026.xml 2022-11-23T02:12:26.7884892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7885070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7885448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7885639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7885659Z 2022-11-23T02:12:26.7885767Z Running tests... 2022-11-23T02:12:26.7886026Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7886322Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7886637Z test_send_recv_any_source_autograd_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T02:12:26.7886658Z 2022-11-23T02:12:26.7886920Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7887030Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7887049Z 2022-11-23T02:12:26.7887154Z OK (skipped=1) 2022-11-23T02:12:26.7887172Z 2022-11-23T02:12:26.7887290Z Generating XML reports... 
2022-11-23T02:12:26.7887741Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021028.xml 2022-11-23T02:12:26.7888118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7888291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7888657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7888848Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7888867Z 2022-11-23T02:12:26.7888975Z Running tests... 2022-11-23T02:12:26.7889237Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7889556Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7889905Z test_send_recv_any_source_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: ucc does not support send/recv from any source (0.002s) 2022-11-23T02:12:26.7889925Z 2022-11-23T02:12:26.7890230Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7890343Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7890362Z 2022-11-23T02:12:26.7890464Z OK (skipped=1) 2022-11-23T02:12:26.7890483Z 2022-11-23T02:12:26.7890587Z Generating XML reports... 
2022-11-23T02:12:26.7891029Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021031.xml 2022-11-23T02:12:26.7891403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7891574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7892036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7892227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7892248Z 2022-11-23T02:12:26.7892356Z Running tests... 2022-11-23T02:12:26.7892619Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7892933Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7893196Z test_send_recv_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7893417Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37584 2022-11-23T02:12:26.7893631Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37585 2022-11-23T02:12:26.7893993Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7894168Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7894549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7894740Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7895105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7895261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7895634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7895819Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7896063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7896303Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7896709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7897108Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7897334Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7897667Z STAGE:2022-11-23 02:10:37 37584:37584 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7897879Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7898214Z STAGE:2022-11-23 02:10:37 37585:37585 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7898486Z [1669169437.418811] [08317a7e7676:37585:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7898718Z [1669169439.021760] [08317a7e7676:37585:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7899001Z [1669169439.021760] [08317a7e7676:37585:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7899344Z STAGE:2022-11-23 02:10:39 37585:37585 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7899615Z [1669169437.398212] [08317a7e7676:37584:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7899839Z [1669169439.033901] [08317a7e7676:37584:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7900069Z [1669169439.033901] [08317a7e7676:37584:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7900457Z STAGE:2022-11-23 02:10:39 37584:37584 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7900790Z STAGE:2022-11-23 02:10:39 37585:37585 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7901137Z STAGE:2022-11-23 02:10:39 37584:37584 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7901237Z ok (6.659s) 2022-11-23T02:12:26.7901257Z 2022-11-23T02:12:26.7901519Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7901625Z Ran 1 test in 6.660s 2022-11-23T02:12:26.7901644Z 2022-11-23T02:12:26.7901733Z OK 2022-11-23T02:12:26.7901752Z 2022-11-23T02:12:26.7901869Z Generating XML reports... 2022-11-23T02:12:26.7902316Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021033.xml 2022-11-23T02:12:26.7902677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7902861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7903242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7903437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7903456Z 2022-11-23T02:12:26.7903561Z Running tests... 
2022-11-23T02:12:26.7903823Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7904134Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7904372Z test_send_recv_nccl (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T02:12:26.7904392Z 2022-11-23T02:12:26.7904651Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7904744Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7904763Z 2022-11-23T02:12:26.7904872Z OK (skipped=1) 2022-11-23T02:12:26.7904891Z 2022-11-23T02:12:26.7905012Z Generating XML reports... 2022-11-23T02:12:26.7905456Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021042.xml 2022-11-23T02:12:26.7905830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7906006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7906387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7906576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7906595Z 2022-11-23T02:12:26.7906701Z Running tests... 2022-11-23T02:12:26.7906945Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7907255Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7907522Z test_send_recv_nccl_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
skip: NCCL Send Recv Only (0.002s) 2022-11-23T02:12:26.7907542Z 2022-11-23T02:12:26.7907796Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7907962Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7907984Z 2022-11-23T02:12:26.7908087Z OK (skipped=1) 2022-11-23T02:12:26.7908105Z 2022-11-23T02:12:26.7908225Z Generating XML reports... 2022-11-23T02:12:26.7908673Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021045.xml 2022-11-23T02:12:26.7909262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7909428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7909819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7910089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7910109Z 2022-11-23T02:12:26.7910210Z Running tests... 2022-11-23T02:12:26.7910463Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7910778Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7911039Z test_send_recv_nccl_torch_profiler (__main__.TestDistBackendWithSpawn) ... skip: NCCL Send Recv Only (0.002s) 2022-11-23T02:12:26.7911058Z 2022-11-23T02:12:26.7911309Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7911401Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7911437Z 2022-11-23T02:12:26.7911527Z OK (skipped=1) 2022-11-23T02:12:26.7911546Z 2022-11-23T02:12:26.7911668Z Generating XML reports... 
2022-11-23T02:12:26.7912108Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021047.xml 2022-11-23T02:12:26.7912477Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7912649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7913032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7913221Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7913241Z 2022-11-23T02:12:26.7913346Z Running tests... 2022-11-23T02:12:26.7913587Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7913897Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7914168Z test_send_recv_torch_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7914389Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37797 2022-11-23T02:12:26.7914612Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37798 2022-11-23T02:12:26.7914985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7915165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7915547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7915721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7916092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7916261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7916636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7916827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7917070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7917376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7917993Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7918524Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7918741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7919075Z STAGE:2022-11-23 02:10:53 37798:37798 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7919300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7919690Z STAGE:2022-11-23 02:10:53 37797:37797 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7919958Z [1669169453.854460] [08317a7e7676:37798:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7920181Z [1669169455.503557] [08317a7e7676:37798:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7920409Z [1669169455.503557] [08317a7e7676:37798:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7920738Z STAGE:2022-11-23 02:10:55 37798:37798 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7921077Z STAGE:2022-11-23 02:10:55 37798:37798 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7921337Z [1669169453.854486] [08317a7e7676:37797:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7921555Z [1669169455.532995] [08317a7e7676:37797:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7921781Z [1669169455.532995] [08317a7e7676:37797:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7922109Z STAGE:2022-11-23 02:10:55 37797:37797 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7922454Z STAGE:2022-11-23 02:10:55 37797:37797 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7922552Z ok (6.777s) 2022-11-23T02:12:26.7922573Z 2022-11-23T02:12:26.7922836Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7922942Z Ran 1 test in 6.777s 2022-11-23T02:12:26.7922961Z 2022-11-23T02:12:26.7923053Z OK 2022-11-23T02:12:26.7923071Z 2022-11-23T02:12:26.7923192Z Generating XML reports... 2022-11-23T02:12:26.7923631Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021049.xml 2022-11-23T02:12:26.7924000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7924175Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7924561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7924752Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7924772Z 2022-11-23T02:12:26.7924877Z Running tests... 2022-11-23T02:12:26.7925138Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7925447Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7925690Z test_send_recv_with_tag (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7925916Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 37911 2022-11-23T02:12:26.7926135Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 37912 2022-11-23T02:12:26.7926552Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7926734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7927117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7927307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7927675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7927844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7928261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7928451Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7928700Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7928948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7929353Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7929753Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7929985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7930205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7930486Z [1669169463.146027] [08317a7e7676:37911:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7930704Z [1669169464.569438] [08317a7e7676:37911:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7930939Z [1669169464.569438] [08317a7e7676:37911:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7931208Z [1669169463.148584] [08317a7e7676:37912:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7931432Z [1669169464.585335] [08317a7e7676:37912:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7931665Z [1669169464.585335] [08317a7e7676:37912:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7931769Z ok (6.149s) 2022-11-23T02:12:26.7931789Z 2022-11-23T02:12:26.7932055Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7932168Z Ran 1 test in 6.149s 2022-11-23T02:12:26.7932188Z 2022-11-23T02:12:26.7932279Z OK 2022-11-23T02:12:26.7932298Z 2022-11-23T02:12:26.7932407Z Generating XML reports... 
2022-11-23T02:12:26.7932859Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021059.xml 2022-11-23T02:12:26.7933233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7933401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7933779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7933970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7933992Z 2022-11-23T02:12:26.7934101Z Running tests... 2022-11-23T02:12:26.7934363Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7934660Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7934991Z test_send_recv_with_tag_autograd_profiler (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7935220Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38021 2022-11-23T02:12:26.7935436Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38022 2022-11-23T02:12:26.7935809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7935988Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7936369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7936607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7936975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7937137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7937516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7937704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7937953Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7938203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7938606Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7939009Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7939242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7939580Z STAGE:2022-11-23 02:11:11 38022:38022 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7939794Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7940130Z STAGE:2022-11-23 02:11:11 38021:38021 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7940403Z [1669169471.952726] [08317a7e7676:38021:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7940633Z [1669169473.601156] [08317a7e7676:38021:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7940870Z [1669169473.601156] [08317a7e7676:38021:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7941208Z STAGE:2022-11-23 02:11:13 38021:38021 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7941480Z [1669169471.959207] [08317a7e7676:38022:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7941708Z [1669169473.604468] [08317a7e7676:38022:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7941941Z [1669169473.604468] [08317a7e7676:38022:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7942285Z STAGE:2022-11-23 02:11:13 38022:38022 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7942620Z STAGE:2022-11-23 02:11:13 38022:38022 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7942973Z STAGE:2022-11-23 02:11:13 38021:38021 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7943069Z ok (6.659s) 2022-11-23T02:12:26.7943089Z 2022-11-23T02:12:26.7943351Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7943504Z Ran 1 test in 6.659s 2022-11-23T02:12:26.7943524Z 2022-11-23T02:12:26.7943613Z OK 2022-11-23T02:12:26.7943631Z 2022-11-23T02:12:26.7943758Z Generating XML reports... 2022-11-23T02:12:26.7944206Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021107.xml 2022-11-23T02:12:26.7944562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7944738Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7945119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7945376Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7945395Z 2022-11-23T02:12:26.7945501Z Running tests... 
2022-11-23T02:12:26.7945763Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7946076Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7946360Z test_send_recv_with_tag_torch_profiler (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7946583Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38135 2022-11-23T02:12:26.7946785Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38136 2022-11-23T02:12:26.7947158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7947333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7947714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7947902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7948270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7948443Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7948819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7949239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7949498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7949741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7950153Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7950555Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7950787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7951120Z STAGE:2022-11-23 02:11:20 38135:38135 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7951348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7951681Z STAGE:2022-11-23 02:11:21 38136:38136 ActivityProfilerController.cpp:300] Completed Stage: Warm Up 2022-11-23T02:12:26.7951940Z [1669169481.033562] [08317a7e7676:38135:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7952172Z [1669169482.680159] [08317a7e7676:38135:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7952415Z [1669169482.680159] [08317a7e7676:38135:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7952832Z STAGE:2022-11-23 02:11:23 38135:38135 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7953108Z [1669169481.036020] [08317a7e7676:38136:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7953334Z [1669169482.653551] [08317a7e7676:38136:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7953568Z [1669169482.653551] [08317a7e7676:38136:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7953911Z STAGE:2022-11-23 02:11:23 38136:38136 ActivityProfilerController.cpp:306] Completed Stage: Collection 2022-11-23T02:12:26.7954323Z STAGE:2022-11-23 02:11:23 38135:38135 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7954668Z STAGE:2022-11-23 02:11:23 38136:38136 ActivityProfilerController.cpp:310] Completed Stage: Post Processing 2022-11-23T02:12:26.7954756Z ok (6.617s) 2022-11-23T02:12:26.7954776Z 2022-11-23T02:12:26.7955036Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7955147Z Ran 1 test in 6.617s 2022-11-23T02:12:26.7955166Z 2022-11-23T02:12:26.7955254Z OK 2022-11-23T02:12:26.7955272Z 2022-11-23T02:12:26.7955394Z Generating XML reports... 2022-11-23T02:12:26.7955844Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021117.xml 2022-11-23T02:12:26.7956218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7956390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7956777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7956953Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7956973Z 2022-11-23T02:12:26.7957087Z Running tests... 
2022-11-23T02:12:26.7957347Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7957658Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7957946Z test_sparse_all_reduce_sum (__main__.TestDistBackendWithSpawn) ... skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T02:12:26.7957965Z 2022-11-23T02:12:26.7958223Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7958327Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7958346Z 2022-11-23T02:12:26.7958450Z OK (skipped=1) 2022-11-23T02:12:26.7958468Z 2022-11-23T02:12:26.7958579Z Generating XML reports... 2022-11-23T02:12:26.7959024Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021126.xml 2022-11-23T02:12:26.7959403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7959578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7959962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7960155Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7960175Z 2022-11-23T02:12:26.7960280Z Running tests... 2022-11-23T02:12:26.7960542Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7960852Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7961128Z test_sparse_all_reduce_sum_cuda (__main__.TestDistBackendWithSpawn) ... 
skip: Only Gloo backend support sparse all reduce (0.002s) 2022-11-23T02:12:26.7961165Z 2022-11-23T02:12:26.7961408Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7961517Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7961585Z 2022-11-23T02:12:26.7961692Z OK (skipped=1) 2022-11-23T02:12:26.7961711Z 2022-11-23T02:12:26.7961833Z Generating XML reports... 2022-11-23T02:12:26.7962286Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021128.xml 2022-11-23T02:12:26.7962658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7962830Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7963206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7963428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7963447Z 2022-11-23T02:12:26.7963552Z Running tests... 2022-11-23T02:12:26.7963810Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7964123Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7964390Z test_stateless_api_with_ddp (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7964613Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38315 2022-11-23T02:12:26.7964831Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38316 2022-11-23T02:12:26.7965202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7965361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7965742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7965939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7966312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7966488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7966867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7967052Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7967300Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7967544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7967931Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7968338Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7968567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7968791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7969065Z [1669169496.428337] [08317a7e7676:38315:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7969295Z [1669169496.442115] [08317a7e7676:38315:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7969531Z [1669169496.442115] [08317a7e7676:38315:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7969801Z [1669169496.433525] [08317a7e7676:38316:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7970030Z [1669169496.446850] [08317a7e7676:38316:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7970322Z [1669169496.446850] [08317a7e7676:38316:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7970412Z ok (6.750s) 2022-11-23T02:12:26.7970432Z 2022-11-23T02:12:26.7970700Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7970808Z Ran 1 test in 6.750s 2022-11-23T02:12:26.7970827Z 2022-11-23T02:12:26.7970914Z OK 2022-11-23T02:12:26.7970933Z 2022-11-23T02:12:26.7971050Z Generating XML reports... 
2022-11-23T02:12:26.7971496Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021131.xml 2022-11-23T02:12:26.7971872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7972093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7972464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7972657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7972676Z 2022-11-23T02:12:26.7972781Z Running tests... 2022-11-23T02:12:26.7973042Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7973351Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7973616Z test_static_graph_api_cpu (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7973833Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38433 2022-11-23T02:12:26.7974051Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38434 2022-11-23T02:12:26.7974426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7974585Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7974967Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7975156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7975572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7975744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7976123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7976311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7976562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7976792Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7977199Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7977596Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7977829Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7978083Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp92gej2sq 2022-11-23T02:12:26.7978344Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp92gej2sq/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7978568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7978827Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9c6042uv 2022-11-23T02:12:26.7979088Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9c6042uv/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7979396Z [1669169504.405951] [08317a7e7676:38434:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7979633Z [1669169505.829188] [08317a7e7676:38434:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7979870Z [1669169505.829188] [08317a7e7676:38434:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7980140Z [1669169504.384435] [08317a7e7676:38433:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7980407Z [1669169505.823184] [08317a7e7676:38433:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7980636Z [1669169505.823184] [08317a7e7676:38433:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7980735Z ok (6.157s) 2022-11-23T02:12:26.7980754Z 2022-11-23T02:12:26.7981025Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7981137Z Ran 1 test in 6.157s 2022-11-23T02:12:26.7981156Z 2022-11-23T02:12:26.7981247Z OK 2022-11-23T02:12:26.7981266Z 2022-11-23T02:12:26.7981372Z Generating XML reports... 2022-11-23T02:12:26.7981820Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021140.xml 2022-11-23T02:12:26.7982192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7982366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7982754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7982946Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7982965Z 2022-11-23T02:12:26.7983071Z Running tests... 2022-11-23T02:12:26.7983336Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7983632Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7983938Z test_sync_bn_logged (__main__.TestDistBackendWithSpawn) ... 
skip: Only Nccl & Gloo backend support DistributedDataParallel (0.002s) 2022-11-23T02:12:26.7983957Z 2022-11-23T02:12:26.7984213Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7984321Z Ran 1 test in 0.002s 2022-11-23T02:12:26.7984340Z 2022-11-23T02:12:26.7984443Z OK (skipped=1) 2022-11-23T02:12:26.7984466Z 2022-11-23T02:12:26.7984583Z Generating XML reports... 2022-11-23T02:12:26.7985028Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021149.xml 2022-11-23T02:12:26.7985403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7985574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7985940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7986125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7986144Z 2022-11-23T02:12:26.7986251Z Running tests... 2022-11-23T02:12:26.7986510Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7986818Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7987112Z test_undefined_grad_parity_unused_parameters (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7987330Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38580 2022-11-23T02:12:26.7987596Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38581 2022-11-23T02:12:26.7987978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7988137Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7988517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7988704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7989288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7989549Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7989966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7990156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7990406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.7990636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.7991035Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.7991431Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.7991656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.7991888Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.7992145Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3irtzllt 2022-11-23T02:12:26.7992415Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3irtzllt/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7992669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoivi602e 2022-11-23T02:12:26.7992934Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoivi602e/_remote_module_non_scriptable.py 2022-11-23T02:12:26.7993191Z [1669169516.902664] [08317a7e7676:38581:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.7993418Z [1669169516.916270] [08317a7e7676:38581:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7993651Z [1669169516.916270] [08317a7e7676:38581:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7994448Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:12:26.7994721Z [1669169516.901664] [08317a7e7676:38580:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.7994950Z [1669169516.915526] [08317a7e7676:38580:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.7995181Z [1669169516.915526] [08317a7e7676:38580:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.7996037Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:12:26.7996143Z ok (6.670s) 2022-11-23T02:12:26.7996163Z 2022-11-23T02:12:26.7996432Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7996541Z Ran 1 test in 6.671s 2022-11-23T02:12:26.7996560Z 2022-11-23T02:12:26.7996651Z OK 2022-11-23T02:12:26.7996671Z 2022-11-23T02:12:26.7996835Z Generating XML reports... 2022-11-23T02:12:26.7997273Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021151.xml 2022-11-23T02:12:26.7997645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.7997823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.7998205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.7998396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.7998416Z 2022-11-23T02:12:26.7998518Z Running tests... 
2022-11-23T02:12:26.7998780Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.7999092Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.7999380Z test_verify_model_across_rank_with_logger (__main__.TestDistBackendWithSpawn) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.7999587Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38698 2022-11-23T02:12:26.7999807Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38699 2022-11-23T02:12:26.8000181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.8000353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.8000737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.8000928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.8001296Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.8001470Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.8001855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.8002028Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.8002273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.8002518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.8002919Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.8003314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.8003544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.8003774Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.8004021Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.8004259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.8004696Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.8005100Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.8005342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:12:26.8005578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:12:26.8005976Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.8006413Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:12:26.8006669Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptyxhv6tg 2022-11-23T02:12:26.8006945Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptyxhv6tg/_remote_module_non_scriptable.py 2022-11-23T02:12:26.8007200Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4wyvs1i1 2022-11-23T02:12:26.8007456Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4wyvs1i1/_remote_module_non_scriptable.py 2022-11-23T02:12:26.8007732Z [1669169526.179934] [08317a7e7676:38698:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.8007957Z [1669169526.193712] [08317a7e7676:38698:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.8008193Z [1669169526.193712] [08317a7e7676:38698:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.8008506Z [1669169531.518214] [08317a7e7676:38698:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. has expired on req 0x560616cbf900, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T02:12:26.8008782Z [1669169531.549311] [08317a7e7676:38698:0] mpool.c:55 UCX WARN object 0x560616dd0e00 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T02:12:26.8009050Z [1669169526.182544] [08317a7e7676:38699:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.8009276Z [1669169526.196045] [08317a7e7676:38699:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.8009506Z [1669169526.196045] [08317a7e7676:38699:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.8009895Z [1669169531.559370] [08317a7e7676:38699:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x561e85a39d40 was not matched 2022-11-23T02:12:26.8009979Z ok (11.158s) 2022-11-23T02:12:26.8010014Z 2022-11-23T02:12:26.8010268Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.8010381Z Ran 1 test in 11.158s 2022-11-23T02:12:26.8010400Z 2022-11-23T02:12:26.8010488Z OK 2022-11-23T02:12:26.8010507Z 2022-11-23T02:12:26.8010631Z Generating XML reports... 2022-11-23T02:12:26.8011080Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021200.xml 2022-11-23T02:12:26.8011452Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.8011628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.8012014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.8012191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.8012224Z 2022-11-23T02:12:26.8012316Z Running tests... 2022-11-23T02:12:26.8012622Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.8012944Z Test results will be stored in test-reports/dist-ucc/distributed.test_distributed_spawn 2022-11-23T02:12:26.8013230Z test_verify_model_across_rank_without_logger (__main__.TestDistBackendWithSpawn) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:12:26.8013452Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 38818 2022-11-23T02:12:26.8013675Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 38819 2022-11-23T02:12:26.8014045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.8014262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.8014632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.8014825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.8015196Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:12:26.8015366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:12:26.8015745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:12:26.8015934Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:12:26.8016181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:12:26.8016423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:12:26.8016815Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:12:26.8017220Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:12:26.8017454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:12:26.8017683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:12:26.8017922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:12:26.8018162Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:12:26.8018557Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.8018958Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:12:26.8019201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:12:26.8019426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:12:26.8019818Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:12:26.8020216Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:12:26.8020474Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdxrbatzs 2022-11-23T02:12:26.8020743Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdxrbatzs/_remote_module_non_scriptable.py 2022-11-23T02:12:26.8020997Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpm1emnfi4 2022-11-23T02:12:26.8021263Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpm1emnfi4/_remote_module_non_scriptable.py 2022-11-23T02:12:26.8021579Z [1669169539.981851] [08317a7e7676:38818:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:12:26.8021816Z [1669169539.995763] [08317a7e7676:38818:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.8022053Z [1669169539.995763] [08317a7e7676:38818:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.8022355Z [1669169545.341527] [08317a7e7676:38818:1] ucc_schedule.h:189 UCC WARN timeout 5 sec. has expired on req 0x560c55b5cd40, seq_num 5, TL_UCP, team_id 1, size 2, rank 0, ctx_rank 0: Barrier n/a inplace=0 bytes=0 2022-11-23T02:12:26.8022635Z [1669169545.372595] [08317a7e7676:38818:0] mpool.c:55 UCX WARN object 0x560c55c6e140 {flags:0x20040 recv length 0 host memory} was not returned to mpool ucp_requests 2022-11-23T02:12:26.8022953Z [1669169539.985311] [08317a7e7676:38819:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 
2022-11-23T02:12:26.8023178Z [1669169539.998592] [08317a7e7676:38819:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:12:26.8023408Z [1669169539.998592] [08317a7e7676:38819:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:12:26.8023800Z [1669169545.382599] [08317a7e7676:38819:0] tag_match.c:62 UCX WARN unexpected tag-receive descriptor 0x5641df7c16c0 was not matched 2022-11-23T02:12:26.8023901Z ok (11.267s) 2022-11-23T02:12:26.8023921Z 2022-11-23T02:12:26.8024184Z ---------------------------------------------------------------------- 2022-11-23T02:12:26.8024298Z Ran 1 test in 11.267s 2022-11-23T02:12:26.8024317Z 2022-11-23T02:12:26.8024404Z OK 2022-11-23T02:12:26.8024423Z 2022-11-23T02:12:26.8024530Z Generating XML reports... 2022-11-23T02:12:26.8024987Z Generated XML report: test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021214.xml 2022-11-23T02:12:26.8025007Z 2022-11-23T02:12:26.8025444Z ##[endgroup] 2022-11-23T02:12:26.8025918Z FINISHED PRINTING LOG FILE of distributed/test_distributed_spawn (/var/lib/jenkins/workspace/test/test-reports/distributed-test_distributed_spawn_3vkyzowr) 2022-11-23T02:12:26.8025937Z 2022-11-23T02:12:26.8026210Z Running distributed/pipeline/sync/test_worker ... [2022-11-23 02:12:26.569691] 2022-11-23T02:12:26.8026600Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_worker.py', '-v'] ... 
[2022-11-23 02:12:26.569994] 2022-11-23T02:12:29.5823425Z 2022-11-23T02:12:29.5824437Z Expand the folded group to see the log file of distributed/pipeline/sync/test_worker 2022-11-23T02:12:29.5826376Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_worker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_worker_gpmp1vbx) 2022-11-23T02:12:29.5827412Z ============================= test session starts ============================== 2022-11-23T02:12:29.5828424Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:29.5828815Z cachedir: .pytest_cache 2022-11-23T02:12:29.5830410Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:29.5831286Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:29.5831624Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:29.5832216Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:29.5832603Z collecting ... 
collected 6 items 2022-11-23T02:12:29.5833717Z Running 6 items in this shard: test/distributed/pipeline/sync/test_worker.py::test_compute_multithreading, test/distributed/pipeline/sync/test_worker.py::test_compute_success, test/distributed/pipeline/sync/test_worker.py::test_compute_exception, test/distributed/pipeline/sync/test_worker.py::test_grad_mode[True], test/distributed/pipeline/sync/test_worker.py::test_grad_mode[False], test/distributed/pipeline/sync/test_worker.py::test_worker_per_device 2022-11-23T02:12:29.5834465Z 2022-11-23T02:12:29.5834699Z distributed/pipeline/sync/test_worker.py::test_compute_multithreading PASSED [ 16%] 2022-11-23T02:12:29.5835161Z distributed/pipeline/sync/test_worker.py::test_compute_success PASSED [ 33%] 2022-11-23T02:12:29.5835605Z distributed/pipeline/sync/test_worker.py::test_compute_exception PASSED [ 50%] 2022-11-23T02:12:29.5836028Z distributed/pipeline/sync/test_worker.py::test_grad_mode[True] PASSED [ 66%] 2022-11-23T02:12:29.5836454Z distributed/pipeline/sync/test_worker.py::test_grad_mode[False] PASSED [ 83%] 2022-11-23T02:12:29.5836997Z distributed/pipeline/sync/test_worker.py::test_worker_per_device PASSED [100%] 2022-11-23T02:12:29.5837249Z 2022-11-23T02:12:29.5837409Z ============================== 6 passed in 0.07s =============================== 2022-11-23T02:12:29.5837587Z 2022-11-23T02:12:29.5837920Z ##[endgroup] 2022-11-23T02:12:29.5838571Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_worker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_worker_gpmp1vbx) 2022-11-23T02:12:29.5838956Z 2022-11-23T02:12:29.5839246Z Running distributed/pipeline/sync/test_pipeline ... [2022-11-23 02:12:29.582429] 2022-11-23T02:12:29.5839859Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_pipeline.py', '-v'] ... 
[2022-11-23 02:12:29.582705] 2022-11-23T02:12:32.0518913Z 2022-11-23T02:12:32.0519622Z Expand the folded group to see the log file of distributed/pipeline/sync/test_pipeline 2022-11-23T02:12:32.0520937Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_pipeline (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_pipeline_pcxx3zvw) 2022-11-23T02:12:32.0521563Z ============================= test session starts ============================== 2022-11-23T02:12:32.0522191Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:32.0522553Z cachedir: .pytest_cache 2022-11-23T02:12:32.0523113Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:32.0523764Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:32.0524100Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:32.0524675Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:32.0525096Z collecting ... collected 1 item 2022-11-23T02:12:32.0525513Z Running 1 items in this shard: test/distributed/pipeline/sync/test_pipeline.py::test_clock_cycles 2022-11-23T02:12:32.0525799Z 2022-11-23T02:12:32.0526017Z distributed/pipeline/sync/test_pipeline.py::test_clock_cycles PASSED [100%] 2022-11-23T02:12:32.0526253Z 2022-11-23T02:12:32.0526418Z ============================== 1 passed in 0.03s =============================== 2022-11-23T02:12:32.0526621Z 2022-11-23T02:12:32.0526944Z ##[endgroup] 2022-11-23T02:12:32.0527601Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_pipeline (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_pipeline_pcxx3zvw) 2022-11-23T02:12:32.0528000Z 2022-11-23T02:12:32.0528291Z Running distributed/pipeline/sync/test_microbatch ... 
[2022-11-23 02:12:32.051985] 2022-11-23T02:12:32.0528921Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_microbatch.py', '-v'] ... [2022-11-23 02:12:32.052263] 2022-11-23T02:12:34.5418522Z 2022-11-23T02:12:34.5419026Z Expand the folded group to see the log file of distributed/pipeline/sync/test_microbatch 2022-11-23T02:12:34.5420319Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_microbatch (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_microbatch_cysk53wy) 2022-11-23T02:12:34.5420878Z ============================= test session starts ============================== 2022-11-23T02:12:34.5421724Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:34.5422088Z cachedir: .pytest_cache 2022-11-23T02:12:34.5422678Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:34.5423166Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:34.5423502Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:34.5424062Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:34.5424469Z collecting ... 
collected 10 items 2022-11-23T02:12:34.5425884Z Running 10 items in this shard: test/distributed/pipeline/sync/test_microbatch.py::test_batch_atomic, test/distributed/pipeline/sync/test_microbatch.py::test_batch_non_atomic, test/distributed/pipeline/sync/test_microbatch.py::test_batch_call, test/distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_index, test/distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_slice, test/distributed/pipeline/sync/test_microbatch.py::test_check, test/distributed/pipeline/sync/test_microbatch.py::test_gather_tensors, test/distributed/pipeline/sync/test_microbatch.py::test_gather_tuples, test/distributed/pipeline/sync/test_microbatch.py::test_scatter_tensor, test/distributed/pipeline/sync/test_microbatch.py::test_scatter_multiple_tensors 2022-11-23T02:12:34.5426989Z 2022-11-23T02:12:34.5427211Z distributed/pipeline/sync/test_microbatch.py::test_batch_atomic PASSED [ 10%] 2022-11-23T02:12:34.5427670Z distributed/pipeline/sync/test_microbatch.py::test_batch_non_atomic PASSED [ 20%] 2022-11-23T02:12:34.5428128Z distributed/pipeline/sync/test_microbatch.py::test_batch_call PASSED [ 30%] 2022-11-23T02:12:34.5428566Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_index PASSED [ 40%] 2022-11-23T02:12:34.5429754Z distributed/pipeline/sync/test_microbatch.py::test_batch_setitem_by_slice PASSED [ 50%] 2022-11-23T02:12:34.5430234Z distributed/pipeline/sync/test_microbatch.py::test_check PASSED [ 60%] 2022-11-23T02:12:34.5430675Z distributed/pipeline/sync/test_microbatch.py::test_gather_tensors PASSED [ 70%] 2022-11-23T02:12:34.5431104Z distributed/pipeline/sync/test_microbatch.py::test_gather_tuples PASSED [ 80%] 2022-11-23T02:12:34.5431544Z distributed/pipeline/sync/test_microbatch.py::test_scatter_tensor PASSED [ 90%] 2022-11-23T02:12:34.5432011Z distributed/pipeline/sync/test_microbatch.py::test_scatter_multiple_tensors PASSED [100%] 2022-11-23T02:12:34.5432286Z 2022-11-23T02:12:34.5432428Z 
============================== 10 passed in 0.08s ============================== 2022-11-23T02:12:34.5432632Z 2022-11-23T02:12:34.5432954Z ##[endgroup] 2022-11-23T02:12:34.5433637Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_microbatch (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_microbatch_cysk53wy) 2022-11-23T02:12:34.5434041Z 2022-11-23T02:12:34.5434357Z Running distributed/pipeline/sync/test_deferred_batch_norm ... [2022-11-23 02:12:34.541974] 2022-11-23T02:12:34.5434998Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_deferred_batch_norm.py', '-v'] ... [2022-11-23 02:12:34.542244] 2022-11-23T02:12:37.5886694Z 2022-11-23T02:12:37.5887567Z Expand the folded group to see the log file of distributed/pipeline/sync/test_deferred_batch_norm 2022-11-23T02:12:37.5888924Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_deferred_batch_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_deferred_batch_norm_8pqms88l) 2022-11-23T02:12:37.5889717Z ============================= test session starts ============================== 2022-11-23T02:12:37.5890585Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:37.5890945Z cachedir: .pytest_cache 2022-11-23T02:12:37.5891927Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:37.5892400Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:37.5892938Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:37.5893538Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:37.5893945Z collecting ... 
collected 11 items 2022-11-23T02:12:37.5896141Z Running 11 items in this shard: test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-1], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-4], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-1], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-4], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[0.1], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[None], test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_convert_deferred_batch_norm, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_eval, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_optimize, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_conv_bn, test/distributed/pipeline/sync/test_deferred_batch_norm.py::test_input_requiring_grad 2022-11-23T02:12:37.5897587Z 2022-11-23T02:12:37.5897918Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-1] PASSED [ 9%] 2022-11-23T02:12:37.5898503Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[True-4] PASSED [ 18%] 2022-11-23T02:12:37.5899082Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-1] PASSED [ 27%] 2022-11-23T02:12:37.5899674Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_transparency[False-4] PASSED [ 36%] 2022-11-23T02:12:37.5900147Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[0.1] PASSED [ 45%] 2022-11-23T02:12:37.5900625Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_running_stats[None] PASSED [ 54%] 2022-11-23T02:12:37.5901125Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_convert_deferred_batch_norm PASSED [ 63%] 2022-11-23T02:12:37.5901583Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_eval 
PASSED [ 72%] 2022-11-23T02:12:37.5902036Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_optimize PASSED [ 81%] 2022-11-23T02:12:37.5902489Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_conv_bn PASSED [ 90%] 2022-11-23T02:12:37.5902960Z distributed/pipeline/sync/test_deferred_batch_norm.py::test_input_requiring_grad PASSED [100%] 2022-11-23T02:12:37.5903239Z 2022-11-23T02:12:37.5903381Z ============================== 11 passed in 0.63s ============================== 2022-11-23T02:12:37.5903575Z 2022-11-23T02:12:37.5903900Z ##[endgroup] 2022-11-23T02:12:37.5904601Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_deferred_batch_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_deferred_batch_norm_8pqms88l) 2022-11-23T02:12:37.5905023Z 2022-11-23T02:12:37.5905312Z Running distributed/pipeline/sync/test_bugs ... [2022-11-23 02:12:37.588780] 2022-11-23T02:12:37.5905902Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/test_bugs.py', '-v'] ... 
[2022-11-23 02:12:37.589068] 2022-11-23T02:12:43.9896919Z 2022-11-23T02:12:43.9897670Z Expand the folded group to see the log file of distributed/pipeline/sync/test_bugs 2022-11-23T02:12:43.9898918Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/test_bugs (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_bugs_x51fmxcj) 2022-11-23T02:12:43.9899462Z ============================= test session starts ============================== 2022-11-23T02:12:43.9900113Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:43.9900478Z cachedir: .pytest_cache 2022-11-23T02:12:43.9901300Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:43.9901759Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:43.9902092Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:43.9902680Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:43.9903073Z collecting ... 
collected 4 items 2022-11-23T02:12:43.9903764Z Running 4 items in this shard: test/distributed/pipeline/sync/test_bugs.py::test_python_autograd_function, test/distributed/pipeline/sync/test_bugs.py::test_exception_no_hang, test/distributed/pipeline/sync/test_bugs.py::test_tuple_wait, test/distributed/pipeline/sync/test_bugs.py::test_parallel_randoms 2022-11-23T02:12:43.9904414Z 2022-11-23T02:12:43.9904644Z distributed/pipeline/sync/test_bugs.py::test_python_autograd_function PASSED [ 25%] 2022-11-23T02:12:43.9905108Z distributed/pipeline/sync/test_bugs.py::test_exception_no_hang PASSED [ 50%] 2022-11-23T02:12:43.9905544Z distributed/pipeline/sync/test_bugs.py::test_tuple_wait PASSED [ 75%] 2022-11-23T02:12:43.9905960Z distributed/pipeline/sync/test_bugs.py::test_parallel_randoms PASSED [100%] 2022-11-23T02:12:43.9906207Z 2022-11-23T02:12:43.9906365Z ============================== 4 passed in 3.87s =============================== 2022-11-23T02:12:43.9906562Z 2022-11-23T02:12:43.9906878Z ##[endgroup] 2022-11-23T02:12:43.9907496Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/test_bugs (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-test_bugs_x51fmxcj) 2022-11-23T02:12:43.9907885Z 2022-11-23T02:12:43.9908191Z Running distributed/pipeline/sync/skip/test_tracker ... [2022-11-23 02:12:43.989688] 2022-11-23T02:12:43.9908824Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_tracker.py', '-v'] ... 
[2022-11-23 02:12:43.989996] 2022-11-23T02:12:47.8573143Z 2022-11-23T02:12:47.8573960Z Expand the folded group to see the log file of distributed/pipeline/sync/skip/test_tracker 2022-11-23T02:12:47.8575332Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/skip/test_tracker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_tracker_u_x5fzsk) 2022-11-23T02:12:47.8576193Z ============================= test session starts ============================== 2022-11-23T02:12:47.8576812Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:47.8577388Z cachedir: .pytest_cache 2022-11-23T02:12:47.8577990Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:47.8578620Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:47.8579053Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:47.8579642Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:47.8580305Z collecting ... 
collected 6 items 2022-11-23T02:12:47.8581552Z Running 6 items in this shard: test/distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker, test/distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker_by_data_parallel, test/distributed/pipeline/sync/skip/test_tracker.py::test_reuse_portal, test/distributed/pipeline/sync/skip/test_tracker.py::test_no_copy_no_portal, test/distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_without_checkpointing, test/distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_with_checkpointing 2022-11-23T02:12:47.8582435Z 2022-11-23T02:12:47.8582881Z distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker PASSED [ 16%] 2022-11-23T02:12:47.8583404Z distributed/pipeline/sync/skip/test_tracker.py::test_default_skip_tracker_by_data_parallel PASSED [ 33%] 2022-11-23T02:12:47.8584336Z distributed/pipeline/sync/skip/test_tracker.py::test_reuse_portal PASSED [ 50%] 2022-11-23T02:12:47.8584831Z distributed/pipeline/sync/skip/test_tracker.py::test_no_copy_no_portal PASSED [ 66%] 2022-11-23T02:12:47.8585568Z distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_without_checkpointing PASSED [ 83%] 2022-11-23T02:12:47.8586112Z distributed/pipeline/sync/skip/test_tracker.py::test_tensor_life_with_checkpointing PASSED [100%] 2022-11-23T02:12:47.8586396Z 2022-11-23T02:12:47.8586650Z ============================== 6 passed in 1.40s =============================== 2022-11-23T02:12:47.8586976Z 2022-11-23T02:12:47.8587322Z ##[endgroup] 2022-11-23T02:12:47.8588212Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/skip/test_tracker (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_tracker_u_x5fzsk) 2022-11-23T02:12:47.8588860Z 2022-11-23T02:12:47.8589760Z Running distributed/pipeline/sync/skip/test_leak ... 
[2022-11-23 02:12:47.857353] 2022-11-23T02:12:47.8590409Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_leak.py', '-v'] ... [2022-11-23 02:12:47.857626] 2022-11-23T02:12:50.5073422Z 2022-11-23T02:12:50.5073940Z Expand the folded group to see the log file of distributed/pipeline/sync/skip/test_leak 2022-11-23T02:12:50.5074964Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/skip/test_leak (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_leak_zgqdv90i) 2022-11-23T02:12:50.5075514Z ============================= test session starts ============================== 2022-11-23T02:12:50.5076125Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:50.5076486Z cachedir: .pytest_cache 2022-11-23T02:12:50.5077087Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:50.5077544Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:50.5077855Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:50.5078440Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:50.5078846Z collecting ... 
collected 8 items 2022-11-23T02:12:50.5080640Z Running 8 items in this shard: test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-train], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-eval], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-train], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-eval], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-train], test/distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-eval], test/distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[train], test/distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[eval] 2022-11-23T02:12:50.5081711Z 2022-11-23T02:12:50.5082062Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-train] PASSED [ 12%] 2022-11-23T02:12:50.5082643Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[always-eval] PASSED [ 25%] 2022-11-23T02:12:50.5083278Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-train] PASSED [ 37%] 2022-11-23T02:12:50.5083890Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[except_last-eval] PASSED [ 50%] 2022-11-23T02:12:50.5084488Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-train] PASSED [ 62%] 2022-11-23T02:12:50.5085057Z distributed/pipeline/sync/skip/test_leak.py::test_delete_portal_tensor[never-eval] PASSED [ 75%] 2022-11-23T02:12:50.5085548Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[train] PASSED [ 87%] 2022-11-23T02:12:50.5086037Z distributed/pipeline/sync/skip/test_leak.py::test_no_portal_without_pipe[eval] PASSED [100%] 2022-11-23T02:12:50.5086306Z 2022-11-23T02:12:50.5086670Z ============================== 8 passed in 0.29s =============================== 
2022-11-23T02:12:50.5086889Z 2022-11-23T02:12:50.5088895Z ##[endgroup] 2022-11-23T02:12:50.5089607Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/skip/test_leak (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_leak_zgqdv90i) 2022-11-23T02:12:50.5090009Z 2022-11-23T02:12:50.5090293Z Running distributed/pipeline/sync/skip/test_api ... [2022-11-23 02:12:50.507437] 2022-11-23T02:12:50.5090912Z Executing ['/opt/conda/bin/python', '-bb', '-m', 'pytest', 'distributed/pipeline/sync/skip/test_api.py', '-v'] ... [2022-11-23 02:12:50.507724] 2022-11-23T02:12:52.9556252Z 2022-11-23T02:12:52.9557014Z Expand the folded group to see the log file of distributed/pipeline/sync/skip/test_api 2022-11-23T02:12:52.9558760Z ##[group]PRINTING LOG FILE of distributed/pipeline/sync/skip/test_api (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_api_svjkjxub) 2022-11-23T02:12:52.9559760Z ============================= test session starts ============================== 2022-11-23T02:12:52.9560807Z platform linux -- Python 3.10.4, pytest-7.2.0, pluggy-1.0.0 -- /opt/conda/bin/python 2022-11-23T02:12:52.9561434Z cachedir: .pytest_cache 2022-11-23T02:12:52.9562535Z hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/var/lib/jenkins/workspace/test/.hypothesis/examples') 2022-11-23T02:12:52.9563350Z torch: 1.14.0a0+git1cfd385 2022-11-23T02:12:52.9564105Z rootdir: /var/lib/jenkins/workspace, configfile: pytest.ini 2022-11-23T02:12:52.9565160Z plugins: hypothesis-5.35.1, flakefinder-1.1.0, rerunfailures-10.3, shard-0.1.2, xdist-3.0.2, xdoctest-1.0.2 2022-11-23T02:12:52.9565913Z collecting ... 
collected 3 items 2022-11-23T02:12:52.9566988Z Running 3 items in this shard: test/distributed/pipeline/sync/skip/test_api.py::test_namespace_difference, test/distributed/pipeline/sync/skip/test_api.py::test_namespace_copy, test/distributed/pipeline/sync/skip/test_api.py::test_skippable_repr 2022-11-23T02:12:52.9567886Z 2022-11-23T02:12:52.9568315Z distributed/pipeline/sync/skip/test_api.py::test_namespace_difference PASSED [ 33%] 2022-11-23T02:12:52.9569211Z distributed/pipeline/sync/skip/test_api.py::test_namespace_copy PASSED [ 66%] 2022-11-23T02:12:52.9570115Z distributed/pipeline/sync/skip/test_api.py::test_skippable_repr PASSED [100%] 2022-11-23T02:12:52.9570625Z 2022-11-23T02:12:52.9570941Z ============================== 3 passed in 0.05s =============================== 2022-11-23T02:12:52.9571313Z 2022-11-23T02:12:52.9571903Z ##[endgroup] 2022-11-23T02:12:52.9573177Z FINISHED PRINTING LOG FILE of distributed/pipeline/sync/skip/test_api (/var/lib/jenkins/workspace/test/test-reports/distributed-pipeline-sync-skip-test_api_svjkjxub) 2022-11-23T02:12:52.9573961Z 2022-11-23T02:12:52.9574551Z Running distributed/elastic/timer/api_test ... [2022-11-23 02:12:52.955781] 2022-11-23T02:12:52.9575930Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/timer/api_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:12:52.956138] 2022-11-23T02:12:54.7696809Z 2022-11-23T02:12:54.7697338Z Expand the folded group to see the log file of distributed/elastic/timer/api_test 2022-11-23T02:12:54.7698327Z ##[group]PRINTING LOG FILE of distributed/elastic/timer/api_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-timer-api_test__7xu45ur) 2022-11-23T02:12:54.7698710Z 2022-11-23T02:12:54.7699012Z ##[endgroup] 2022-11-23T02:12:54.7699742Z FINISHED PRINTING LOG FILE of distributed/elastic/timer/api_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-timer-api_test__7xu45ur) 2022-11-23T02:12:54.7700130Z 2022-11-23T02:12:54.7700663Z Running distributed/checkpoint/test_dedup_tensors ... [2022-11-23 02:12:54.769785] 2022-11-23T02:12:54.7704109Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/checkpoint/test_dedup_tensors.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:12:54.770084] 2022-11-23T02:12:58.7572512Z 2022-11-23T02:12:58.7573243Z Expand the folded group to see the log file of distributed/checkpoint/test_dedup_tensors 2022-11-23T02:12:58.7574329Z ##[group]PRINTING LOG FILE of distributed/checkpoint/test_dedup_tensors (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_dedup_tensors_l69j0dem) 2022-11-23T02:12:58.7574759Z 2022-11-23T02:12:58.7574874Z Running tests... 2022-11-23T02:12:58.7575375Z ---------------------------------------------------------------------- 2022-11-23T02:12:58.7576218Z Test results will be stored in test-reports/python-unittest/distributed.checkpoint.test_dedup_tensors 2022-11-23T02:12:58.7576982Z test_dedup_shards (__main__.TestDedupTensor) ... 
ok (1.664s) 2022-11-23T02:12:58.7577380Z 2022-11-23T02:12:58.7577777Z ---------------------------------------------------------------------- 2022-11-23T02:12:58.7578120Z Ran 1 test in 1.664s 2022-11-23T02:12:58.7578290Z 2022-11-23T02:12:58.7578384Z OK 2022-11-23T02:12:58.7578530Z 2022-11-23T02:12:58.7578660Z Generating XML reports... 2022-11-23T02:12:58.7579351Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_dedup_tensors/TEST-TestDedupTensor-20221123021256.xml 2022-11-23T02:12:58.7579884Z 2022-11-23T02:12:58.7580220Z ##[endgroup] 2022-11-23T02:12:58.7581181Z FINISHED PRINTING LOG FILE of distributed/checkpoint/test_dedup_tensors (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_dedup_tensors_l69j0dem) 2022-11-23T02:12:58.7581590Z 2022-11-23T02:12:58.7581888Z Running distributed/_shard/sharded_tensor/ops/test_math_ops ... [2022-11-23 02:12:58.757321] 2022-11-23T02:12:58.7582632Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_math_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:12:58.757711] 2022-11-23T02:13:00.8773294Z 2022-11-23T02:13:00.8773816Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_math_ops 2022-11-23T02:13:00.8774815Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_math_ops_1p19wdoe) 2022-11-23T02:13:00.8775236Z 2022-11-23T02:13:00.8775533Z ##[endgroup] 2022-11-23T02:13:00.8776319Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_math_ops_1p19wdoe) 2022-11-23T02:13:00.8776728Z 2022-11-23T02:13:00.8777218Z Running distributed/_composable/test_checkpoint ... 
[2022-11-23 02:13:00.877423] 2022-11-23T02:13:00.8780288Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_composable/test_checkpoint.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:00.877731] 2022-11-23T02:13:05.3286706Z 2022-11-23T02:13:05.3287509Z Expand the folded group to see the log file of distributed/_composable/test_checkpoint 2022-11-23T02:13:05.3289428Z ##[group]PRINTING LOG FILE of distributed/_composable/test_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-_composable-test_checkpoint_uzdhdjis) 2022-11-23T02:13:05.3289986Z 2022-11-23T02:13:05.3290104Z Running tests... 2022-11-23T02:13:05.3290873Z ---------------------------------------------------------------------- 2022-11-23T02:13:05.3291993Z Test results will be stored in test-reports/python-unittest/distributed._composable.test_checkpoint 2022-11-23T02:13:05.3292469Z test_tensor_only_cpu (__main__.TestCheckpoint) ... ok (0.035s) 2022-11-23T02:13:05.3292830Z test_tensor_only_gpu (__main__.TestCheckpoint) ... ok (0.416s) 2022-11-23T02:13:05.3293200Z 2022-11-23T02:13:05.3293773Z ---------------------------------------------------------------------- 2022-11-23T02:13:05.3294471Z Ran 2 tests in 0.452s 2022-11-23T02:13:05.3294644Z 2022-11-23T02:13:05.3294738Z OK 2022-11-23T02:13:05.3294874Z 2022-11-23T02:13:05.3294980Z Generating XML reports... 2022-11-23T02:13:05.3295868Z Generated XML report: test-reports/python-unittest/distributed._composable.test_checkpoint/TEST-TestCheckpoint-20221123021304.xml 2022-11-23T02:13:05.3296252Z 2022-11-23T02:13:05.3296576Z ##[endgroup] 2022-11-23T02:13:05.3297195Z FINISHED PRINTING LOG FILE of distributed/_composable/test_checkpoint (/var/lib/jenkins/workspace/test/test-reports/distributed-_composable-test_checkpoint_uzdhdjis) 2022-11-23T02:13:05.3297576Z 2022-11-23T02:13:05.3297852Z Running distributed/checkpoint/test_utils ... 
[2022-11-23 02:13:05.328743] 2022-11-23T02:13:05.3298541Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/checkpoint/test_utils.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:05.329049] 2022-11-23T02:13:09.2954455Z 2022-11-23T02:13:09.2955225Z Expand the folded group to see the log file of distributed/checkpoint/test_utils 2022-11-23T02:13:09.2956982Z ##[group]PRINTING LOG FILE of distributed/checkpoint/test_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_utils_z3wrze1w) 2022-11-23T02:13:09.2957354Z 2022-11-23T02:13:09.2957470Z Running tests... 2022-11-23T02:13:09.2957984Z ---------------------------------------------------------------------- 2022-11-23T02:13:09.2958817Z Test results will be stored in test-reports/python-unittest/distributed.checkpoint.test_utils 2022-11-23T02:13:09.2959793Z test_flat_data (__main__.TestMedatadaIndex) ... ok (1.640s) 2022-11-23T02:13:09.2960302Z test_index_hint_ignored_on_equals (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-11-23T02:13:09.2960726Z test_index_hint_ignored_on_hash (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-11-23T02:13:09.2961264Z test_init_convert_offset (__main__.TestMedatadaIndex) ... ok (0.001s) 2022-11-23T02:13:09.2962130Z test_sharded_tensor_lookup (__main__.TestMedatadaIndex) ... ok (0.003s) 2022-11-23T02:13:09.2962583Z 2022-11-23T02:13:09.2962869Z ---------------------------------------------------------------------- 2022-11-23T02:13:09.2963208Z Ran 5 tests in 1.646s 2022-11-23T02:13:09.2963375Z 2022-11-23T02:13:09.2963450Z OK 2022-11-23T02:13:09.2963583Z 2022-11-23T02:13:09.2963706Z Generating XML reports... 
2022-11-23T02:13:09.2964328Z Generated XML report: test-reports/python-unittest/distributed.checkpoint.test_utils/TEST-TestMedatadaIndex-20221123021307.xml 2022-11-23T02:13:09.2964693Z 2022-11-23T02:13:09.2964990Z ##[endgroup] 2022-11-23T02:13:09.2965606Z FINISHED PRINTING LOG FILE of distributed/checkpoint/test_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-checkpoint-test_utils_z3wrze1w) 2022-11-23T02:13:09.2965972Z 2022-11-23T02:13:09.2966241Z Running distributed/fsdp/test_utils ... [2022-11-23 02:13:09.295529] 2022-11-23T02:13:09.2966914Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_utils.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:09.295874] 2022-11-23T02:13:13.1922107Z 2022-11-23T02:13:13.1923132Z Expand the folded group to see the log file of distributed/fsdp/test_utils 2022-11-23T02:13:13.1924820Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_utils_p6zt3gw8) 2022-11-23T02:13:13.1925241Z 2022-11-23T02:13:13.1925354Z Running tests... 2022-11-23T02:13:13.1925865Z ---------------------------------------------------------------------- 2022-11-23T02:13:13.1926403Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_utils 2022-11-23T02:13:13.1926861Z test_module_wrap_policy (__main__.TestGetSubmoduleToStates) 2022-11-23T02:13:13.1927289Z Tests the module wrap policy on a nested model with buffers and a ... ok (1.618s) 2022-11-23T02:13:13.1927692Z test_apply_to_tensors_cpu_cuda (__main__.TestUtils) ... ok (0.004s) 2022-11-23T02:13:13.1928181Z test_apply_to_tensors_devices_['cpu'] (__main__.TestUtils) ... ok (0.003s) 2022-11-23T02:13:13.1928674Z test_apply_to_tensors_devices_['cuda'] (__main__.TestUtils) ... ok (0.003s) 2022-11-23T02:13:13.1929270Z test_packed_sequence (__main__.TestUtils) 2022-11-23T02:13:13.1929653Z Test to ensure RNN packed sequences are modified correctly. ... 
ok (0.002s) 2022-11-23T02:13:13.1930045Z test_replace_by_prefix (__main__.TestUtils) ... ok (0.001s) 2022-11-23T02:13:13.1930260Z 2022-11-23T02:13:13.1930532Z ---------------------------------------------------------------------- 2022-11-23T02:13:13.1930848Z Ran 6 tests in 1.633s 2022-11-23T02:13:13.1931011Z 2022-11-23T02:13:13.1931106Z OK 2022-11-23T02:13:13.1931240Z 2022-11-23T02:13:13.1931363Z Generating XML reports... 2022-11-23T02:13:13.1931963Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestGetSubmoduleToStates-20221123021311.xml 2022-11-23T02:13:13.1932822Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestUtils-20221123021311.xml 2022-11-23T02:13:13.1933152Z 2022-11-23T02:13:13.1933465Z ##[endgroup] 2022-11-23T02:13:13.1934047Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_utils (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_utils_p6zt3gw8) 2022-11-23T02:13:13.1934374Z 2022-11-23T02:13:13.1934686Z Running distributed/_shard/sharded_optim/test_sharded_optim ... [2022-11-23 02:13:13.192244] 2022-11-23T02:13:13.1935427Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_optim/test_sharded_optim.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:13.192617] 2022-11-23T02:13:19.5338859Z 2022-11-23T02:13:19.5339398Z Expand the folded group to see the log file of distributed/_shard/sharded_optim/test_sharded_optim 2022-11-23T02:13:19.5340415Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_optim/test_sharded_optim (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_optim-test_sharded_optim_evz93rpy) 2022-11-23T02:13:19.5340869Z 2022-11-23T02:13:19.5340982Z Running tests... 
2022-11-23T02:13:19.5341504Z ---------------------------------------------------------------------- 2022-11-23T02:13:19.5342118Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim 2022-11-23T02:13:19.5342667Z test_named_params_with_sharded_tensor (__main__.TestShardedOptimizer) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:13:19.5343732Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82023 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.641s) 2022-11-23T02:13:19.5344554Z test_sharded_optim (__main__.TestShardedOptimizer) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 39990 2022-11-23T02:13:19.5345093Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 39991 2022-11-23T02:13:19.5345552Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 39992 2022-11-23T02:13:19.5346297Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 39993 2022-11-23T02:13:19.5347518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:19.5348379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:19.5349755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:19.5350632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:19.5351799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:19.5352744Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:19.5353875Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:19.5355131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:19.5356311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:19.5357182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:19.5358331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:19.5359255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:19.5360387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:19.5361227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:19.5362580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:19.5363471Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:19.5364314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:19.5365200Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:13:19.5366126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:13:19.5367039Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:19.5367816Z skip: Need at least 4 CUDA devices (2.412s) 2022-11-23T02:13:19.5368179Z 2022-11-23T02:13:19.5368724Z ---------------------------------------------------------------------- 2022-11-23T02:13:19.5369352Z Ran 2 tests in 4.053s 2022-11-23T02:13:19.5369674Z 2022-11-23T02:13:19.5369872Z OK (skipped=2) 
2022-11-23T02:13:19.5370164Z 2022-11-23T02:13:19.5370372Z Generating XML reports... 2022-11-23T02:13:19.5371700Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20221123021315.xml 2022-11-23T02:13:19.5372476Z 2022-11-23T02:13:19.5373077Z ##[endgroup] 2022-11-23T02:13:19.5374426Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_optim/test_sharded_optim (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_optim-test_sharded_optim_evz93rpy) 2022-11-23T02:13:19.5375248Z 2022-11-23T02:13:19.5375782Z Running distributed/test_data_parallel ... [2022-11-23 02:13:19.533935] 2022-11-23T02:13:19.5377126Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_data_parallel.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:19.534347] 2022-11-23T02:13:28.6789276Z 2022-11-23T02:13:28.6790562Z Expand the folded group to see the log file of distributed/test_data_parallel 2022-11-23T02:13:28.6792128Z ##[group]PRINTING LOG FILE of distributed/test_data_parallel (/var/lib/jenkins/workspace/test/test-reports/distributed-test_data_parallel_72136i1x) 2022-11-23T02:13:28.6792772Z 2022-11-23T02:13:28.6792975Z Running tests... 2022-11-23T02:13:28.6793874Z ---------------------------------------------------------------------- 2022-11-23T02:13:28.6794887Z Test results will be stored in test-reports/python-unittest/distributed.test_data_parallel 2022-11-23T02:13:28.6795725Z test_autocast (__main__.TestDataParallel) ... ok (1.892s) 2022-11-23T02:13:28.6796408Z test_data_parallel (__main__.TestDataParallel) ... ok (0.092s) 2022-11-23T02:13:28.6797134Z test_data_parallel_buffers_requiring_grad (__main__.TestDataParallel) ... ok (0.013s) 2022-11-23T02:13:28.6797719Z test_data_parallel_complex (__main__.TestDataParallel) ... ok (0.048s) 2022-11-23T02:13:28.6798144Z test_data_parallel_device_args (__main__.TestDataParallel) ... 
ok (0.006s) 2022-11-23T02:13:28.6798571Z test_data_parallel_function_deletion (__main__.TestDataParallel) ... ok (0.006s) 2022-11-23T02:13:28.6799845Z test_data_parallel_lazy_linear (__main__.TestDataParallel) ... /opt/conda/lib/python3.10/site-packages/torch/nn/modules/lazy.py:180: UserWarning: Lazy modules are a new feature under heavy development so changes to the API or functionality can happen at any moment. 2022-11-23T02:13:28.6801005Z warnings.warn('Lazy modules are a new feature under heavy development ' 2022-11-23T02:13:28.6801341Z ok (0.002s) 2022-11-23T02:13:28.6801661Z test_data_parallel_model_device (__main__.TestDataParallel) 2022-11-23T02:13:28.6802013Z Test device[0] check at forward time. ... ok (0.035s) 2022-11-23T02:13:28.6802409Z test_data_parallel_model_no_refcycles (__main__.TestDataParallel) ... ok (0.105s) 2022-11-23T02:13:28.6802851Z test_data_parallel_module_zero_inputs (__main__.TestDataParallel) ... ok (0.005s) 2022-11-23T02:13:28.6803755Z test_data_parallel_multiple_input (__main__.TestDataParallel) ... /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/comm.py:231: UserWarning: Using -1 to represent CPU tensor is deprecated. Please use a device object or string instead, e.g., "cpu". 2022-11-23T02:13:28.6804454Z warnings.warn( 2022-11-23T02:13:28.6804697Z ok (0.023s) 2022-11-23T02:13:28.6805034Z test_data_parallel_nested_input (__main__.TestDataParallel) ... ok (0.003s) 2022-11-23T02:13:28.6805446Z test_data_parallel_nested_output (__main__.TestDataParallel) ... ok (0.006s) 2022-11-23T02:13:28.6805865Z test_data_parallel_no_grad (__main__.TestDataParallel) ... ok (0.004s) 2022-11-23T02:13:28.6806268Z test_data_parallel_rnn (__main__.TestDataParallel) ... ok (1.003s) 2022-11-23T02:13:28.6806676Z test_data_parallel_small_back (__main__.TestDataParallel) ... ok (0.004s) 2022-11-23T02:13:28.6807071Z test_data_parallel_sparse (__main__.TestDataParallel) ... 
ok (0.012s) 2022-11-23T02:13:28.6807950Z test_gather_cpu (__main__.TestDataParallel) ... /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/_functions.py:68: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. 2022-11-23T02:13:28.6808662Z warnings.warn('Was asked to gather along dimension 0, but all ' 2022-11-23T02:13:28.6808981Z ok (0.044s) 2022-11-23T02:13:28.6809293Z test_gather_different_len_dicts (__main__.TestDataParallel) ... ok (0.001s) 2022-11-23T02:13:28.6809687Z test_gather_gpu (__main__.TestDataParallel) ... ok (0.044s) 2022-11-23T02:13:28.6810068Z test_parallel_apply (__main__.TestDataParallel) ... ok (0.005s) 2022-11-23T02:13:28.6810452Z test_parallel_apply_autocast (__main__.TestDataParallel) ... ok (0.006s) 2022-11-23T02:13:28.6810884Z test_parallel_apply_passes_exception (__main__.TestDataParallel) ... ok (0.002s) 2022-11-23T02:13:28.6811322Z test_parameter_list_dict_replica (__main__.TestDataParallel) ... ok (0.008s) 2022-11-23T02:13:28.6811700Z test_replicate (__main__.TestDataParallel) ... ok (0.004s) 2022-11-23T02:13:28.6812087Z test_replicate_buffers (__main__.TestDataParallel) ... ok (0.003s) 2022-11-23T02:13:28.6812484Z test_save_replica_module (__main__.TestDataParallel) ... ok (0.003s) 2022-11-23T02:13:28.6812866Z test_scatter_cpu (__main__.TestDataParallel) ... ok (0.019s) 2022-11-23T02:13:28.6813224Z test_scatter_gpu (__main__.TestDataParallel) ... ok (0.019s) 2022-11-23T02:13:28.6813608Z test_strided_grad_layout (__main__.TestDataParallel) ... ok (1.196s) 2022-11-23T02:13:28.6813987Z test_zero_grad (__main__.TestDataParallel) ... ok (0.008s) 2022-11-23T02:13:28.6814417Z test_data_parallel_module_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6814936Z test_data_parallel_module_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... 
ok (0.005s) 2022-11-23T02:13:28.6815447Z test_data_parallel_module_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.004s) 2022-11-23T02:13:28.6815980Z test_data_parallel_module_kwargs_only_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6816503Z test_data_parallel_module_kwargs_only_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6817133Z test_data_parallel_module_kwargs_only_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6817712Z test_data_parallel_module_kwargs_only_empty_dict_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6818283Z test_data_parallel_module_kwargs_only_empty_dict_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6818837Z test_data_parallel_module_kwargs_only_empty_dict_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6819396Z test_data_parallel_module_kwargs_only_empty_list_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6820026Z test_data_parallel_module_kwargs_only_empty_list_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6820573Z test_data_parallel_module_kwargs_only_empty_list_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6821145Z test_data_parallel_module_kwargs_only_empty_tuple_cuda_float16 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6821721Z test_data_parallel_module_kwargs_only_empty_tuple_cuda_float32 (__main__.TestDataParallelDeviceTypeCUDA) ... ok (0.005s) 2022-11-23T02:13:28.6822284Z test_data_parallel_module_kwargs_only_empty_tuple_cuda_float64 (__main__.TestDataParallelDeviceTypeCUDA) ... 
ok (0.005s) 2022-11-23T02:13:28.6822589Z 2022-11-23T02:13:28.6822858Z ---------------------------------------------------------------------- 2022-11-23T02:13:28.6823197Z Ran 46 tests in 4.698s 2022-11-23T02:13:28.6823363Z 2022-11-23T02:13:28.6823458Z OK 2022-11-23T02:13:28.6823594Z 2022-11-23T02:13:28.6823725Z Generating XML reports... 2022-11-23T02:13:28.6824300Z Generated XML report: test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallel-20221123021323.xml 2022-11-23T02:13:28.6825117Z Generated XML report: test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallelDeviceTypeCUDA-20221123021323.xml 2022-11-23T02:13:28.6825509Z 2022-11-23T02:13:28.6825852Z ##[endgroup] 2022-11-23T02:13:28.6826425Z FINISHED PRINTING LOG FILE of distributed/test_data_parallel (/var/lib/jenkins/workspace/test/test-reports/distributed-test_data_parallel_72136i1x) 2022-11-23T02:13:28.6826777Z 2022-11-23T02:13:28.6827070Z Running distributed/elastic/utils/distributed_test ... [2022-11-23 02:13:28.679099] 2022-11-23T02:13:28.6827784Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/elastic/utils/distributed_test.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:28.679473] 2022-11-23T02:13:35.8223275Z 2022-11-23T02:13:35.8224013Z Expand the folded group to see the log file of distributed/elastic/utils/distributed_test 2022-11-23T02:13:35.8225031Z ##[group]PRINTING LOG FILE of distributed/elastic/utils/distributed_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-utils-distributed_test_oojo5p9d) 2022-11-23T02:13:35.8225454Z 2022-11-23T02:13:35.8225576Z Running tests... 2022-11-23T02:13:35.8226104Z ---------------------------------------------------------------------- 2022-11-23T02:13:35.8226711Z Test results will be stored in test-reports/python-unittest/distributed.elastic.utils.distributed_test 2022-11-23T02:13:35.8227211Z test_create_store_multi (__main__.DistributedUtilTest) ... 
ok (1.689s) 2022-11-23T02:13:35.8227639Z test_create_store_no_port_multi (__main__.DistributedUtilTest) ... ok (0.001s) 2022-11-23T02:13:35.8228759Z test_create_store_single_server (__main__.DistributedUtilTest) ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/66207 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.000s) 2022-11-23T02:13:35.8229972Z test_create_store_timeout_on_server (__main__.DistributedUtilTest) ... ok (3.029s) 2022-11-23T02:13:35.8230781Z test_create_store_timeout_on_worker (__main__.DistributedUtilTest) ... [E socket.cpp:860] [c10d] The client socket has timed out after 1s while trying to connect to (08317a7e7676, 0). 2022-11-23T02:13:35.8231236Z ok (0.001s) 2022-11-23T02:13:35.8231891Z test_port_already_in_use_on_server (__main__.DistributedUtilTest) ... [W socket.cpp:426] [c10d] The server socket has failed to bind to [::]:36003 (errno: 98 - Address already in use). 2022-11-23T02:13:35.8232560Z [W socket.cpp:426] [c10d] The server socket has failed to bind to 0.0.0.0:36003 (errno: 98 - Address already in use). 2022-11-23T02:13:35.8233035Z [E socket.cpp:462] [c10d] The server socket has failed to listen on any local network address. 2022-11-23T02:13:35.8233493Z ok (0.004s) 2022-11-23T02:13:35.8233931Z test_port_already_in_use_on_worker (__main__.DistributedUtilTest) ... [E socket.cpp:860] [c10d] The client socket has timed out after 1s while trying to connect to (08317a7e7676, 59513). 
2022-11-23T02:13:35.8234357Z ok (0.004s) 2022-11-23T02:13:35.8234512Z 2022-11-23T02:13:35.8234787Z ---------------------------------------------------------------------- 2022-11-23T02:13:35.8235117Z Ran 7 tests in 4.730s 2022-11-23T02:13:35.8235263Z 2022-11-23T02:13:35.8235371Z OK (skipped=1) 2022-11-23T02:13:35.8235525Z 2022-11-23T02:13:35.8235648Z Generating XML reports... 2022-11-23T02:13:35.8236293Z Generated XML report: test-reports/python-unittest/distributed.elastic.utils.distributed_test/TEST-DistributedUtilTest-20221123021330.xml 2022-11-23T02:13:35.8236685Z 2022-11-23T02:13:35.8236983Z ##[endgroup] 2022-11-23T02:13:35.8237643Z FINISHED PRINTING LOG FILE of distributed/elastic/utils/distributed_test (/var/lib/jenkins/workspace/test/test-reports/distributed-elastic-utils-distributed_test_oojo5p9d) 2022-11-23T02:13:35.8238053Z 2022-11-23T02:13:35.8238332Z Running distributed/fsdp/test_fsdp_uneven ... [2022-11-23 02:13:35.822344] 2022-11-23T02:13:35.8239024Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_uneven.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:35.822620] 2022-11-23T02:13:44.1404950Z 2022-11-23T02:13:44.1406222Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_uneven 2022-11-23T02:13:44.1407878Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_uneven_3d95zz89) 2022-11-23T02:13:44.1408550Z 2022-11-23T02:13:44.1408758Z Running tests... 2022-11-23T02:13:44.1409617Z ---------------------------------------------------------------------- 2022-11-23T02:13:44.1410649Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven 2022-11-23T02:13:44.1411505Z test_one_iteration (__main__.TestUnevenParamShard) 2022-11-23T02:13:44.1412265Z Test FSDP with uneven divide of parameter shards. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:13:44.1412913Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40420 2022-11-23T02:13:44.1413369Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40421 2022-11-23T02:13:44.1414039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:44.1414508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:44.1415203Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:44.1416018Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:44.1416634Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:44.1417097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:44.1417668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:44.1418377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:44.1418867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:13:44.1419357Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:13:44.1420038Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:13:44.1420745Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:13:44.1421279Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:44.1421849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:44.1422208Z dist init r=0, world=2 2022-11-23T02:13:44.1422462Z dist init r=1, world=2 2022-11-23T02:13:44.1422686Z ok (6.018s) 2022-11-23T02:13:44.1422839Z 2022-11-23T02:13:44.1423119Z ---------------------------------------------------------------------- 2022-11-23T02:13:44.1423451Z Ran 1 test in 6.019s 2022-11-23T02:13:44.1423615Z 2022-11-23T02:13:44.1423710Z OK 2022-11-23T02:13:44.1423873Z 2022-11-23T02:13:44.1423983Z Generating XML reports... 2022-11-23T02:13:44.1424605Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven/TEST-TestUnevenParamShard-20221123021337.xml 2022-11-23T02:13:44.1424974Z 2022-11-23T02:13:44.1425323Z ##[endgroup] 2022-11-23T02:13:44.1425915Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_uneven_3d95zz89) 2022-11-23T02:13:44.1426284Z 2022-11-23T02:13:44.1426567Z Running distributed/fsdp/test_fsdp_pure_fp16 ... [2022-11-23 02:13:44.140532] 2022-11-23T02:13:44.1427266Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_pure_fp16.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:44.141003] 2022-11-23T02:13:52.4867215Z 2022-11-23T02:13:52.4867716Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_pure_fp16 2022-11-23T02:13:52.4868688Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_pure_fp16_mc4kpvub) 2022-11-23T02:13:52.4869503Z 2022-11-23T02:13:52.4869635Z Running tests... 
2022-11-23T02:13:52.4870164Z ---------------------------------------------------------------------- 2022-11-23T02:13:52.4870750Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16 2022-11-23T02:13:52.4871233Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=False) (__main__.TestPureFP16) 2022-11-23T02:13:52.4871859Z Tests pure FP16 training, including when the parameter's dtype is ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:13:52.4872924Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/73315 for platform(s) linux. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.625s) 2022-11-23T02:13:52.4873663Z test_pure_fp16_cpu_offload_CPUOffload(offload_params=True) (__main__.TestPureFP16) 2022-11-23T02:13:52.4874344Z Tests pure FP16 training, including when the parameter's dtype is ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40538 2022-11-23T02:13:52.4874895Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40539 2022-11-23T02:13:52.4875540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:52.4876360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:52.4877400Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:52.4878685Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:52.4879598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:13:52.4880065Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:13:52.4880649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:13:52.4881111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:13:52.4881577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:13:52.4882207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:13:52.4882888Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:13:52.4883579Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:13:52.4884109Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:13:52.4884590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:13:52.4885877Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:13:52.4886665Z warnings.warn( 2022-11-23T02:13:52.4887887Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:13:52.4888665Z warnings.warn( 2022-11-23T02:13:52.4888931Z File "", line 1, in 2022-11-23T02:13:52.4889288Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:13:52.4889660Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:13:52.4890041Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:13:52.4890416Z return self._bootstrap(parent_sentinel) 2022-11-23T02:13:52.4890794Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:13:52.4891134Z self.run() 2022-11-23T02:13:52.4891474Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:13:52.4891825Z self._target(*self._args, **self._kwargs) 2022-11-23T02:13:52.4892346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:13:52.4892740Z self.run_test(test_name, pipe) 2022-11-23T02:13:52.4893253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:13:52.4893647Z getattr(self, test_name)() 2022-11-23T02:13:52.4894166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:13:52.4894541Z fn() 2022-11-23T02:13:52.4895018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:13:52.4895413Z test(self, **param_kwargs) 2022-11-23T02:13:52.4896063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:13:52.4896453Z return func(*args, **kwargs) 2022-11-23T02:13:52.4896851Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T02:13:52.4897228Z self._test_fsdp_parity( 2022-11-23T02:13:52.4897754Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:13:52.4898165Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:13:52.4898726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:13:52.4899194Z output = model(*input) 2022-11-23T02:13:52.4899660Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:13:52.4900052Z return forward_call(*input, **kwargs) 2022-11-23T02:13:52.4900604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:13:52.4901065Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:13:52.4901621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:13:52.4902012Z _lazy_init(state, module) 2022-11-23T02:13:52.4902520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:13:52.4902911Z handle.init_flat_param_attributes() 2022-11-23T02:13:52.4903428Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:13:52.4903817Z return func(*args, **kwargs) 2022-11-23T02:13:52.4904357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:13:52.4904729Z p_assert( 2022-11-23T02:13:52.4905199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:13:52.4905585Z traceback.print_stack() 2022-11-23T02:13:52.4905852Z File "", line 1, in 2022-11-23T02:13:52.4906224Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 
2022-11-23T02:13:52.4906598Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:13:52.4906957Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:13:52.4907329Z return self._bootstrap(parent_sentinel) 2022-11-23T02:13:52.4907721Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:13:52.4908063Z self.run() 2022-11-23T02:13:52.4908384Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:13:52.4908753Z self._target(*self._args, **self._kwargs) 2022-11-23T02:13:52.4909698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:13:52.4910082Z self.run_test(test_name, pipe) 2022-11-23T02:13:52.4910616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:13:52.4911016Z getattr(self, test_name)() 2022-11-23T02:13:52.4911517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:13:52.4911886Z fn() 2022-11-23T02:13:52.4912382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:13:52.4912787Z test(self, **param_kwargs) 2022-11-23T02:13:52.4913290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:13:52.4913685Z return func(*args, **kwargs) 2022-11-23T02:13:52.4914181Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_pure_fp16.py", line 47, in test_pure_fp16 2022-11-23T02:13:52.4914555Z self._test_fsdp_parity( 2022-11-23T02:13:52.4915081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:13:52.4915510Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:13:52.4916071Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:13:52.4916452Z output = model(*input) 2022-11-23T02:13:52.4916931Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:13:52.4917394Z return forward_call(*input, **kwargs) 2022-11-23T02:13:52.4917932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:13:52.4918393Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:13:52.4918966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:13:52.4919362Z _lazy_init(state, module) 2022-11-23T02:13:52.4919855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:13:52.4920266Z handle.init_flat_param_attributes() 2022-11-23T02:13:52.4920785Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:13:52.4921153Z return func(*args, **kwargs) 2022-11-23T02:13:52.4921690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:13:52.4922081Z p_assert( 2022-11-23T02:13:52.4922539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:13:52.4922926Z traceback.print_stack() 2022-11-23T02:13:52.4923195Z dist init r=0, world=2 2022-11-23T02:13:52.4923675Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:13:52.4924110Z dist init r=1, world=2 2022-11-23T02:13:52.4924582Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:13:52.4925014Z ok (4.413s) 2022-11-23T02:13:52.4925165Z 2022-11-23T02:13:52.4925421Z ---------------------------------------------------------------------- 2022-11-23T02:13:52.4925752Z Ran 2 tests in 6.038s 2022-11-23T02:13:52.4925919Z 2022-11-23T02:13:52.4926026Z OK (skipped=1) 2022-11-23T02:13:52.4926179Z 2022-11-23T02:13:52.4926303Z Generating XML reports... 2022-11-23T02:13:52.4926869Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20221123021346.xml 2022-11-23T02:13:52.4927217Z 2022-11-23T02:13:52.4927549Z ##[endgroup] 2022-11-23T02:13:52.4928159Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_pure_fp16 (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_pure_fp16_mc4kpvub) 2022-11-23T02:13:52.4928526Z 2022-11-23T02:13:52.4928815Z Running distributed/_shard/sharded_tensor/ops/test_softmax ... [2022-11-23 02:13:52.486829] 2022-11-23T02:13:52.4929556Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_softmax.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:13:52.487138] 2022-11-23T02:14:01.2859525Z 2022-11-23T02:14:01.2860066Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_softmax 2022-11-23T02:14:01.2861123Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_softmax (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_softmax_6oc_3lkt) 2022-11-23T02:14:01.2862431Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfcmuafhf 2022-11-23T02:14:01.2879411Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfcmuafhf/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2879872Z 2022-11-23T02:14:01.2879988Z Running tests... 
2022-11-23T02:14:01.2880554Z ---------------------------------------------------------------------- 2022-11-23T02:14:01.2881161Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax 2022-11-23T02:14:01.2881708Z test_sharded_softmax_basic (__main__.TestShardedSoftmax) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:14:01.2882180Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40656 2022-11-23T02:14:01.2882846Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40657 2022-11-23T02:14:01.2883313Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40658 2022-11-23T02:14:01.2883760Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40659 2022-11-23T02:14:01.2884402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2884871Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:01.2885463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2885928Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2886517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2886983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:01.2887601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2888088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2888679Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2889136Z warnings.warn(f"loaded {len(slow_tests_dict)} 
slow tests") 2022-11-23T02:14:01.2889699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2890179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2890760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2891217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:01.2891781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2892254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2892731Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi6z12u9s 2022-11-23T02:14:01.2893262Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi6z12u9s/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2893807Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2_10r9zu 2022-11-23T02:14:01.2894353Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2_10r9zu/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2894869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:01.2895357Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkcrz9d2u 2022-11-23T02:14:01.2895908Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkcrz9d2u/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2896429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:01.2896983Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:01.2897507Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkxfegrmp 
2022-11-23T02:14:01.2898058Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkxfegrmp/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2898574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:01.2898958Z skip: Need at least 4 CUDA devices (4.070s) 2022-11-23T02:14:01.2899460Z test_sharded_softmax_on_sharding_dim (__main__.TestShardedSoftmax) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40792 2022-11-23T02:14:01.2900067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40793 2022-11-23T02:14:01.2900529Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40794 2022-11-23T02:14:01.2900972Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40795 2022-11-23T02:14:01.2901594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2902060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:01.2902630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2903113Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2903704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2904162Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:01.2904731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2905208Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2905802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2906237Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:01.2906817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2907292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2907876Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:01.2908309Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:01.2908891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:01.2909653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:01.2910139Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwnujalkl 2022-11-23T02:14:01.2910677Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwnujalkl/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2911223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3qj4dhwr 2022-11-23T02:14:01.2911772Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3qj4dhwr/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2912295Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmped5mwyux 2022-11-23T02:14:01.2912844Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmped5mwyux/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2913393Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi06fqxdp 2022-11-23T02:14:01.2913942Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi06fqxdp/_remote_module_non_scriptable.py 2022-11-23T02:14:01.2914533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:01.2915028Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:01.2915506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:01.2915967Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:01.2916367Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:14:01.2916566Z 2022-11-23T02:14:01.2916853Z ---------------------------------------------------------------------- 2022-11-23T02:14:01.2917187Z Ran 2 tests in 6.480s 2022-11-23T02:14:01.2917338Z 2022-11-23T02:14:01.2917450Z OK (skipped=2) 2022-11-23T02:14:01.2917686Z 2022-11-23T02:14:01.2917813Z Generating XML reports... 2022-11-23T02:14:01.2918458Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax/TEST-TestShardedSoftmax-20221123021354.xml 2022-11-23T02:14:01.2918844Z 2022-11-23T02:14:01.2919182Z ##[endgroup] 2022-11-23T02:14:01.2919857Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_softmax (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_softmax_6oc_3lkt) 2022-11-23T02:14:01.2920264Z 2022-11-23T02:14:01.2920567Z Running distributed/_shard/sharded_tensor/ops/test_chunk ... [2022-11-23 02:14:01.286025] 2022-11-23T02:14:01.2921296Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_chunk.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:14:01.286306] 2022-11-23T02:14:10.0273519Z 2022-11-23T02:14:10.0274111Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_chunk 2022-11-23T02:14:10.0275146Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_chunk (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_chunk_0gwqznu1) 2022-11-23T02:14:10.0275570Z 2022-11-23T02:14:10.0275699Z Running tests... 2022-11-23T02:14:10.0276213Z ---------------------------------------------------------------------- 2022-11-23T02:14:10.0276824Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_chunk 2022-11-23T02:14:10.0277378Z test_sharded_chunk (__main__.TestShardedTensorChunkOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:14:10.0277868Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 40963 2022-11-23T02:14:10.0278332Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 40964 2022-11-23T02:14:10.0278786Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 40965 2022-11-23T02:14:10.0279244Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 40966 2022-11-23T02:14:10.0279867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:10.0280333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:10.0280922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0281386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0281980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:10.0282434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:14:10.0283053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0283536Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0284104Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:10.0284553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:10.0285387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0285876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0286472Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:10.0286925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:10.0287565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0288023Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0288606Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:10.0289091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:10.0289561Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:10.0290033Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:10.0290424Z skip: Need at least 4 CUDA devices (4.017s) 2022-11-23T02:14:10.0290929Z test_sharded_chunk_error (__main__.TestShardedTensorChunkOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41099 2022-11-23T02:14:10.0291466Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41100 2022-11-23T02:14:10.0291921Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41101 2022-11-23T02:14:10.0292372Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41102 2022-11-23T02:14:10.0292985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:10.0293441Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:10.0294034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0294511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0295084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:10.0295542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:10.0296115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:10.0296570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:10.0297137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0297618Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0298219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0298674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0299263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:14:10.0299720Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:10.0300302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:10.0300756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:10.0301209Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:10.0301693Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:10.0302152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:10.0302705Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:10.0303109Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:14:10.0303309Z 2022-11-23T02:14:10.0303586Z ---------------------------------------------------------------------- 2022-11-23T02:14:10.0303898Z Ran 2 tests in 6.428s 2022-11-23T02:14:10.0304063Z 2022-11-23T02:14:10.0304173Z OK (skipped=2) 2022-11-23T02:14:10.0304328Z 2022-11-23T02:14:10.0304454Z Generating XML reports... 2022-11-23T02:14:10.0305104Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_chunk/TEST-TestShardedTensorChunkOps-20221123021403.xml 2022-11-23T02:14:10.0305563Z 2022-11-23T02:14:10.0305883Z ##[endgroup] 2022-11-23T02:14:10.0306549Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_chunk (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_chunk_0gwqznu1) 2022-11-23T02:14:10.0306941Z 2022-11-23T02:14:10.0307219Z Running distributed/test_c10d_error_logger ... 
[2022-11-23 02:14:10.027446] 2022-11-23T02:14:10.0307897Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_error_logger.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:14:10.027718] 2022-11-23T02:14:20.1890239Z 2022-11-23T02:14:20.1890881Z Expand the folded group to see the log file of distributed/test_c10d_error_logger 2022-11-23T02:14:20.1891879Z ##[group]PRINTING LOG FILE of distributed/test_c10d_error_logger (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_error_logger_mpfobxyt) 2022-11-23T02:14:20.1892250Z 2022-11-23T02:14:20.1892366Z Running tests... 2022-11-23T02:14:20.1893139Z ---------------------------------------------------------------------- 2022-11-23T02:14:20.1893740Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_error_logger 2022-11-23T02:14:20.1894276Z test_exception_handler_with_dist (__main__.C10dErrorLoggerTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:14:20.1894948Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41270 2022-11-23T02:14:20.1895525Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41271 2022-11-23T02:14:20.1896153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:20.1896615Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:20.1897201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:20.1897664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:20.1898260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:20.1898712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:20.1899299Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:20.1899760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:20.1900205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:20.1900706Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:14:20.1901209Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:20.1901682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:20.1902356Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:20.1903085Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:14:20.1903730Z ok (5.549s) 2022-11-23T02:14:20.1904201Z test_get_or_create_logger (__main__.C10dErrorLoggerTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41349 2022-11-23T02:14:20.1904772Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41350 2022-11-23T02:14:20.1905433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:20.1905900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:20.1906515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:20.1907133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:20.1907761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:20.1908235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:20.1908850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:20.1909633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:20.1910091Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:20.1910605Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:20.1910974Z ok (2.309s) 2022-11-23T02:14:20.1911129Z 2022-11-23T02:14:20.1911423Z ---------------------------------------------------------------------- 2022-11-23T02:14:20.1911762Z Ran 2 tests in 7.859s 2022-11-23T02:14:20.1911933Z 2022-11-23T02:14:20.1912026Z OK 2022-11-23T02:14:20.1912164Z 2022-11-23T02:14:20.1912293Z Generating XML reports... 
2022-11-23T02:14:20.1912915Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_error_logger/TEST-C10dErrorLoggerTest-20221123021411.xml 2022-11-23T02:14:20.1913291Z 2022-11-23T02:14:20.1913627Z ##[endgroup] 2022-11-23T02:14:20.1914257Z FINISHED PRINTING LOG FILE of distributed/test_c10d_error_logger (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_error_logger_mpfobxyt) 2022-11-23T02:14:20.1914639Z 2022-11-23T02:14:20.1914962Z Running distributed/_shard/sharding_spec/test_sharding_spec ... [2022-11-23 02:14:20.189030] 2022-11-23T02:14:20.1915741Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharding_spec/test_sharding_spec.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:14:20.189298] 2022-11-23T02:14:31.3545167Z 2022-11-23T02:14:31.3546023Z Expand the folded group to see the log file of distributed/_shard/sharding_spec/test_sharding_spec 2022-11-23T02:14:31.3547050Z ##[group]PRINTING LOG FILE of distributed/_shard/sharding_spec/test_sharding_spec (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharding_spec-test_sharding_spec_m3lpjlmj) 2022-11-23T02:14:31.3547820Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvz1hjxp9 2022-11-23T02:14:31.3549402Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvz1hjxp9/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3549989Z 2022-11-23T02:14:31.3550175Z Running tests... 2022-11-23T02:14:31.3551028Z ---------------------------------------------------------------------- 2022-11-23T02:14:31.3552116Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec 2022-11-23T02:14:31.3553059Z test_custom_sharding_spec (__main__.TestCustomShardingSpec) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:14:31.3553960Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41452 2022-11-23T02:14:31.3554744Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41453 2022-11-23T02:14:31.3555470Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41454 2022-11-23T02:14:31.3556463Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41455 2022-11-23T02:14:31.3557456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3558153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3559032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3559738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3560614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3561431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3562454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3562951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3563547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3563982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3564570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3565043Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3565631Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3566067Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3566653Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3567127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3567602Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfcm9pf2g 2022-11-23T02:14:31.3568146Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfcm9pf2g/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3568693Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmbzxagmw 2022-11-23T02:14:31.3569243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmbzxagmw/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3569748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:31.3570254Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmput473kst 2022-11-23T02:14:31.3570804Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmput473kst/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3571321Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:31.3571788Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:31.3572285Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2fm6fzzc 2022-11-23T02:14:31.3572835Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2fm6fzzc/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3573330Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:31.3573680Z ok (4.007s) 
2022-11-23T02:14:31.3574026Z test_custom_sharding_spec_shard_tensor (__main__.TestCustomShardingSpec) 2022-11-23T02:14:31.3574543Z Test custom spec can be invoked from the ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41588 2022-11-23T02:14:31.3575041Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41589 2022-11-23T02:14:31.3575502Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41590 2022-11-23T02:14:31.3576038Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41591 2022-11-23T02:14:31.3576655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3577111Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3577696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3578172Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3578738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3579257Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3579839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3580316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3580884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3581340Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3581917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3582372Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3582956Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3583411Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3583989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3584438Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3584923Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc5gh3c2k 2022-11-23T02:14:31.3585478Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc5gh3c2k/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3586001Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr1eugajg 2022-11-23T02:14:31.3586547Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr1eugajg/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3587118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:31.3587599Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:31.3588089Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpltq26_7n 2022-11-23T02:14:31.3588630Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpltq26_7n/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3589381Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmptm0vap0p 2022-11-23T02:14:31.3589930Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptm0vap0p/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3590429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:31.3590909Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 0 2022-11-23T02:14:31.3591306Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:14:31.3591677Z test_custom_sharding_spec_tensor_ctor (__main__.TestCustomShardingSpec) 2022-11-23T02:14:31.3592198Z Test sharded_tensor.ones(...) with the custom ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41724 2022-11-23T02:14:31.3592721Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 41725 2022-11-23T02:14:31.3593178Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 41726 2022-11-23T02:14:31.3593703Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 41727 2022-11-23T02:14:31.3594341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3594803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3595369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3595884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3596470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3597006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3597575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3598054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3598639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3599088Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3599652Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3600126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3600704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:31.3601155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:31.3601720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:31.3602196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:31.3602667Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp744oaodg 2022-11-23T02:14:31.3603194Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp744oaodg/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3603734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_1t_kxab 2022-11-23T02:14:31.3604276Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_1t_kxab/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3604791Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:31.3605279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp625li_ja 2022-11-23T02:14:31.3605821Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp625li_ja/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3606357Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9w62sqbv 2022-11-23T02:14:31.3606878Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9w62sqbv/_remote_module_non_scriptable.py 2022-11-23T02:14:31.3607392Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:31.3607867Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:31.3608339Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:31.3608719Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:14:31.3609089Z test_check_overlapping (__main__.TestShardingSpec) ... ok (0.003s) 2022-11-23T02:14:31.3609497Z test_chunked_sharding_spec (__main__.TestShardingSpec) ... ok (0.012s) 2022-11-23T02:14:31.3609889Z test_device_placement (__main__.TestShardingSpec) ... ok (0.007s) 2022-11-23T02:14:31.3610298Z test_enumerable_sharding_spec (__main__.TestShardingSpec) ... ok (0.007s) 2022-11-23T02:14:31.3610784Z test_get_chunk_sharding_params (__main__.TestShardingSpec) ... ok (0.002s) 2022-11-23T02:14:31.3611199Z test_get_chunked_dim_size (__main__.TestShardingSpec) ... ok (0.001s) 2022-11-23T02:14:31.3611571Z test_get_split_size (__main__.TestShardingSpec) ... ok (0.001s) 2022-11-23T02:14:31.3611998Z test_infer_sharding_spec_from_shards_metadata (__main__.TestShardingSpec) ... ok (0.010s) 2022-11-23T02:14:31.3612259Z 2022-11-23T02:14:31.3612538Z ---------------------------------------------------------------------- 2022-11-23T02:14:31.3612851Z Ran 11 tests in 8.871s 2022-11-23T02:14:31.3613017Z 2022-11-23T02:14:31.3613126Z OK (skipped=2) 2022-11-23T02:14:31.3613281Z 2022-11-23T02:14:31.3613406Z Generating XML reports... 
2022-11-23T02:14:31.3614133Z Generated XML report: test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestCustomShardingSpec-20221123021422.xml 2022-11-23T02:14:31.3614961Z Generated XML report: test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestShardingSpec-20221123021422.xml 2022-11-23T02:14:31.3615339Z 2022-11-23T02:14:31.3615685Z ##[endgroup] 2022-11-23T02:14:31.3616367Z FINISHED PRINTING LOG FILE of distributed/_shard/sharding_spec/test_sharding_spec (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharding_spec-test_sharding_spec_m3lpjlmj) 2022-11-23T02:14:31.3616778Z 2022-11-23T02:14:31.3617031Z Running distributed/fsdp/test_fsdp_input ... [2022-11-23 02:14:31.354563] 2022-11-23T02:14:31.3617720Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_input.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:14:31.354954] 2022-11-23T02:14:43.6769848Z 2022-11-23T02:14:43.6770295Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_input 2022-11-23T02:14:43.6771316Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_input (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_input_sqo8atg4) 2022-11-23T02:14:43.6771691Z 2022-11-23T02:14:43.6771833Z Running tests... 2022-11-23T02:14:43.6772364Z ---------------------------------------------------------------------- 2022-11-23T02:14:43.6772910Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_input 2022-11-23T02:14:43.6773337Z test_input_type_dict (__main__.TestInput) 2022-11-23T02:14:43.6773766Z Test FSDP with input being a list or a dict, only single GPU. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:14:43.6774264Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41895 2022-11-23T02:14:43.6774888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:43.6775361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:43.6775952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:43.6776419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:43.6776890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:43.6777567Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:14:43.6778112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:43.6779391Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:43.6780202Z warnings.warn( 2022-11-23T02:14:43.6780457Z dist init r=0, world=1 2022-11-23T02:14:43.6780948Z ok (5.875s) 2022-11-23T02:14:43.6781231Z test_input_type_list (__main__.TestInput) 2022-11-23T02:14:43.6781717Z Test FSDP with input being a list or a dict, only single GPU. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 41937 2022-11-23T02:14:43.6782421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:43.6782885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:43.6783456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:43.6783937Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:43.6784511Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:14:43.6785196Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:14:43.6785715Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:43.6787069Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:14:43.6787866Z warnings.warn( 2022-11-23T02:14:43.6788124Z dist init r=0, world=1 2022-11-23T02:14:43.6788355Z ok (4.111s) 2022-11-23T02:14:43.6788507Z 2022-11-23T02:14:43.6788781Z ---------------------------------------------------------------------- 2022-11-23T02:14:43.6789583Z Ran 2 tests in 9.986s 2022-11-23T02:14:43.6789877Z 2022-11-23T02:14:43.6790032Z OK 2022-11-23T02:14:43.6790249Z 2022-11-23T02:14:43.6790465Z Generating XML reports... 
2022-11-23T02:14:43.6791249Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20221123021433.xml 2022-11-23T02:14:43.6791584Z 2022-11-23T02:14:43.6791913Z ##[endgroup] 2022-11-23T02:14:43.6792498Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_input (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_input_sqo8atg4) 2022-11-23T02:14:43.6792860Z 2022-11-23T02:14:43.6793188Z Running distributed/_shard/sharded_tensor/ops/test_elementwise_ops ... [2022-11-23 02:14:43.677045] 2022-11-23T02:14:43.6793957Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_elementwise_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:14:43.677345] 2022-11-23T02:14:57.2360391Z 2022-11-23T02:14:57.2360931Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-11-23T02:14:57.2362019Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_elementwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_elementwise_ops_gu9efkw4) 2022-11-23T02:14:57.2362469Z 2022-11-23T02:14:57.2362589Z Running tests... 2022-11-23T02:14:57.2363114Z ---------------------------------------------------------------------- 2022-11-23T02:14:57.2363738Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_elementwise_ops 2022-11-23T02:14:57.2364310Z test_sharded_dropout (__main__.TestShardedTensorElementWiseOps) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:14:57.2364837Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42014 2022-11-23T02:14:57.2365300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42015 2022-11-23T02:14:57.2365733Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42016 2022-11-23T02:14:57.2366445Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42017 2022-11-23T02:14:57.2367116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2367578Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2368152Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2368630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2369219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2369786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2370353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2370831Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2371420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2371852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2372429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2372899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2373480Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2373979Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2374547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2375035Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2375491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:57.2375953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:57.2376433Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:57.2376895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:57.2377296Z skip: Need at least 4 CUDA devices (4.018s) 2022-11-23T02:14:57.2377791Z test_sharded_gelu (__main__.TestShardedTensorElementWiseOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42150 2022-11-23T02:14:57.2378358Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42151 2022-11-23T02:14:57.2378814Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42152 2022-11-23T02:14:57.2379251Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42153 2022-11-23T02:14:57.2379882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2380344Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2380925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2381386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2381971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2382430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2383007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2383461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2384115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2384582Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2385146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2385620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2386205Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:14:57.2386701Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2387337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2387814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2388265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:57.2388734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:57.2389503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:57.2389982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:57.2390383Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:14:57.2390881Z test_sharded_relu (__main__.TestShardedTensorElementWiseOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42286 2022-11-23T02:14:57.2391449Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42287 2022-11-23T02:14:57.2391904Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42288 2022-11-23T02:14:57.2392355Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42289 2022-11-23T02:14:57.2392971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2393433Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2394016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2394474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2395059Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2395509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2396097Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2396549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2397141Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2397588Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2398154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2398626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2399209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2399656Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2400215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2400690Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2401135Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:57.2401723Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:57.2402197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:57.2402667Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:57.2403065Z skip: Need at least 4 CUDA devices (2.410s) 
2022-11-23T02:14:57.2403575Z test_sharded_tensor_nan_to_num (__main__.TestShardedTensorElementWiseOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42422 2022-11-23T02:14:57.2404151Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42423 2022-11-23T02:14:57.2404685Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42424 2022-11-23T02:14:57.2405141Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42425 2022-11-23T02:14:57.2405751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2406211Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2406794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2407256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2407840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2408290Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2408874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2409339Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2409929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2410378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2410963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2411418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:14:57.2411999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:14:57.2412448Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:14:57.2413007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:14:57.2413483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:14:57.2413922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:14:57.2414406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:14:57.2414868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:14:57.2415336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:14:57.2415732Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:14:57.2415931Z 2022-11-23T02:14:57.2416196Z ---------------------------------------------------------------------- 2022-11-23T02:14:57.2416530Z Ran 4 tests in 11.249s 2022-11-23T02:14:57.2416696Z 2022-11-23T02:14:57.2416810Z OK (skipped=4) 2022-11-23T02:14:57.2416964Z 2022-11-23T02:14:57.2417095Z Generating XML reports... 2022-11-23T02:14:57.2417791Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_elementwise_ops/TEST-TestShardedTensorElementWiseOps-20221123021445.xml 2022-11-23T02:14:57.2418230Z 2022-11-23T02:14:57.2418562Z ##[endgroup] 2022-11-23T02:14:57.2419420Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_elementwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_elementwise_ops_gu9efkw4) 2022-11-23T02:14:57.2419867Z 2022-11-23T02:14:57.2420136Z Running distributed/_shard/test_partial_tensor ... 
[2022-11-23 02:14:57.236222] 2022-11-23T02:14:57.2420843Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/test_partial_tensor.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:14:57.236555] 2022-11-23T02:15:13.3690768Z 2022-11-23T02:15:13.3691280Z Expand the folded group to see the log file of distributed/_shard/test_partial_tensor 2022-11-23T02:15:13.3693281Z ##[group]PRINTING LOG FILE of distributed/_shard/test_partial_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-test_partial_tensor_q9145koy) 2022-11-23T02:15:13.3693725Z 2022-11-23T02:15:13.3693843Z Running tests... 2022-11-23T02:15:13.3694392Z ---------------------------------------------------------------------- 2022-11-23T02:15:13.3695295Z Test results will be stored in test-reports/python-unittest/distributed._shard.test_partial_tensor 2022-11-23T02:15:13.3695792Z test_cat (__main__.TestPartialTensorOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:13.3696416Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42593 2022-11-23T02:15:13.3696954Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42594 2022-11-23T02:15:13.3697409Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42595 2022-11-23T02:15:13.3697843Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42596 2022-11-23T02:15:13.3698492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3698960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3699530Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3700016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3700600Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3701059Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3701619Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3702096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3702689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3703125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3703713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3704181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3704825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3705276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3705835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3706308Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3706756Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:13.3707248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:13.3707707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:13.3708303Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:13.3708714Z skip: Need at least 4 CUDA devices (4.069s) 
2022-11-23T02:15:13.3709783Z test_cat_errors (__main__.TestPartialTensorOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42729 2022-11-23T02:15:13.3710722Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42730 2022-11-23T02:15:13.3711189Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42731 2022-11-23T02:15:13.3711643Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42732 2022-11-23T02:15:13.3712270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3712845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3713437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3713897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3714484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3714932Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3715507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3715959Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3716544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3716999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3717559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3718029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3718615Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3719064Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3719625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3720098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3720541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:13.3721025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:13.3721479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:13.3721946Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:13.3722347Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:15:13.3722811Z test_transpose (__main__.TestPartialTensorOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 42865 2022-11-23T02:15:13.3723339Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 42866 2022-11-23T02:15:13.3723795Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 42867 2022-11-23T02:15:13.3724243Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 42868 2022-11-23T02:15:13.3724849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3725308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3725889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3726421Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3727022Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3727472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3728049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3728504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3729087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3729595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3730170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3730622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3731209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:15:13.3731654Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3732211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3732675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3733119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:13.3733600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:13.3734064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:13.3734535Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:13.3734934Z skip: Need at least 4 CUDA devices (2.510s) 2022-11-23T02:15:13.3735423Z test_partial_tensor_reshard (__main__.TestPartialTensorReshard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43001 2022-11-23T02:15:13.3735974Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43002 2022-11-23T02:15:13.3736425Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 43003 2022-11-23T02:15:13.3736873Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 43004 2022-11-23T02:15:13.3737475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3737931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3738514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3738992Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3739566Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3740012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3740592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3741045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3741625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3742071Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3742652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3743104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3743736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3744188Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3744751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3745219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3745659Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:13.3746138Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:13.3746598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:13.3747123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:13.3747518Z skip: Need at least 4 CUDA devices (2.411s) 
2022-11-23T02:15:13.3748018Z test_partial_tensor_reshard_errors (__main__.TestPartialTensorReshard) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43137 2022-11-23T02:15:13.3748582Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43138 2022-11-23T02:15:13.3752279Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 43139 2022-11-23T02:15:13.3752763Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 43140 2022-11-23T02:15:13.3753378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3753840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3754428Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3754907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3755481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3755927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3756507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3756963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3757546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3757993Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3758569Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3759030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3759613Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:13.3760060Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:13.3760618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:13.3761089Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:13.3761532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:15:13.3762010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:13.3762468Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:13.3762943Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:15:13.3763341Z skip: Need at least 4 CUDA devices (2.411s) 2022-11-23T02:15:13.3763538Z 2022-11-23T02:15:13.3763921Z ---------------------------------------------------------------------- 2022-11-23T02:15:13.3764249Z Ran 5 tests in 13.811s 2022-11-23T02:15:13.3764414Z 2022-11-23T02:15:13.3764523Z OK (skipped=5) 2022-11-23T02:15:13.3764679Z 2022-11-23T02:15:13.3764804Z Generating XML reports... 
2022-11-23T02:15:13.3765408Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20221123021459.xml 2022-11-23T02:15:13.3766227Z Generated XML report: test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20221123021459.xml 2022-11-23T02:15:13.3766606Z 2022-11-23T02:15:13.3766938Z ##[endgroup] 2022-11-23T02:15:13.3767537Z FINISHED PRINTING LOG FILE of distributed/_shard/test_partial_tensor (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-test_partial_tensor_q9145koy) 2022-11-23T02:15:13.3767986Z 2022-11-23T02:15:13.3768258Z Running distributed/_tensor/test_math_ops ... [2022-11-23 02:15:13.369367] 2022-11-23T02:15:13.3768942Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_math_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:15:13.369769] 2022-11-23T02:15:29.5129256Z 2022-11-23T02:15:29.5130029Z Expand the folded group to see the log file of distributed/_tensor/test_math_ops 2022-11-23T02:15:29.5131069Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_math_ops_g0vm_ls3) 2022-11-23T02:15:29.5131442Z 2022-11-23T02:15:29.5131556Z Running tests... 2022-11-23T02:15:29.5132110Z ---------------------------------------------------------------------- 2022-11-23T02:15:29.5132669Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_math_ops 2022-11-23T02:15:29.5133188Z test_softmax_fwd (__main__.DistMathOpsTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:29.5133795Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43308 2022-11-23T02:15:29.5134554Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43309 2022-11-23T02:15:29.5135534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:29.5136390Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:29.5137756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:29.5138434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:29.5139035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:29.5139498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:29.5140062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:29.5140545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:29.5140995Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:29.5141501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:29.5141978Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:29.5142479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:29.5143152Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:15:29.5143856Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:29.5144235Z ok (5.619s) 2022-11-23T02:15:29.5144911Z test_softmax_with_bwd (__main__.DistMathOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43389 2022-11-23T02:15:29.5145458Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43390 2022-11-23T02:15:29.5146063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:29.5146520Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:29.5147105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:29.5147586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:29.5148161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:29.5148725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:29.5149820Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:29.5150290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:29.5150773Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:29.5151273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:29.5151769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:29.5152260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:29.5152910Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:29.5153613Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:29.5154015Z ok (4.112s) 2022-11-23T02:15:29.5154410Z test_sum (__main__.DistMathOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43474 2022-11-23T02:15:29.5154913Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43475 2022-11-23T02:15:29.5155532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:29.5155987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:29.5156550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:29.5157026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:29.5157613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:29.5158050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:29.5158632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:29.5159111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:29.5159556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:29.5160039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:29.5160532Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:29.5161022Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:29.5161687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:29.5162372Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:29.5162774Z ok (4.011s) 2022-11-23T02:15:29.5162924Z 2022-11-23T02:15:29.5163305Z ---------------------------------------------------------------------- 2022-11-23T02:15:29.5163636Z Ran 3 tests in 13.743s 2022-11-23T02:15:29.5163802Z 2022-11-23T02:15:29.5163896Z OK 2022-11-23T02:15:29.5164031Z 2022-11-23T02:15:29.5164158Z Generating XML reports... 2022-11-23T02:15:29.5164750Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_math_ops/TEST-DistMathOpsTest-20221123021515.xml 2022-11-23T02:15:29.5165078Z 2022-11-23T02:15:29.5165413Z ##[endgroup] 2022-11-23T02:15:29.5166004Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_math_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_math_ops_g0vm_ls3) 2022-11-23T02:15:29.5166435Z 2022-11-23T02:15:29.5166738Z Running distributed/_tensor/parallel/test_tp_examples ... [2022-11-23 02:15:29.513054] 2022-11-23T02:15:29.5167454Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/parallel/test_tp_examples.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:15:29.513378] 2022-11-23T02:15:47.0602892Z 2022-11-23T02:15:47.0603609Z Expand the folded group to see the log file of distributed/_tensor/parallel/test_tp_examples 2022-11-23T02:15:47.0604724Z ##[group]PRINTING LOG FILE of distributed/_tensor/parallel/test_tp_examples (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_tp_examples_t3v94amr) 2022-11-23T02:15:47.0605160Z 2022-11-23T02:15:47.0605276Z Running tests... 
2022-11-23T02:15:47.0606148Z ---------------------------------------------------------------------- 2022-11-23T02:15:47.0606831Z Test results will be stored in test-reports/python-unittest/distributed._tensor.parallel.test_tp_examples 2022-11-23T02:15:47.0607426Z test_mlp_megatron_e2e (__main__.DistTensorParallelExampleTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:15:47.0607946Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43590 2022-11-23T02:15:47.0608692Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43591 2022-11-23T02:15:47.0609470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:47.0609937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:47.0610519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:47.0610986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:47.0611577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:47.0612033Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:47.0612622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:47.0613078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:47.0613534Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:47.0614037Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:47.0614515Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:47.0615011Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:47.0615689Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:47.0616386Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:47.0616773Z ok (6.145s) 2022-11-23T02:15:47.0617241Z test_self_attn_megatron_e2e (__main__.DistTensorParallelExampleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43675 2022-11-23T02:15:47.0618061Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43676 2022-11-23T02:15:47.0618711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:47.0619155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:47.0619736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:47.0620247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:47.0620832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:47.0621368Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:47.0621947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:47.0622426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:47.0622872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:47.0623348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:47.0623841Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 0 2022-11-23T02:15:47.0624334Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:47.0624983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:47.0625690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:47.0626092Z ok (4.516s) 2022-11-23T02:15:47.0626583Z test_self_attn_replacement_megatron_e2e (__main__.DistTensorParallelExampleTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43760 2022-11-23T02:15:47.0627147Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43761 2022-11-23T02:15:47.0627760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:47.0628220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:47.0628798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:47.0629599Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:47.0630198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:15:47.0630657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:15:47.0631224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:15:47.0631696Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:15:47.0632139Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:15:47.0632638Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:15:47.0633115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:15:47.0633605Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:15:47.0634275Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:47.0634960Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:15:47.0635358Z ok (4.515s) 2022-11-23T02:15:47.0635508Z 2022-11-23T02:15:47.0635881Z ---------------------------------------------------------------------- 2022-11-23T02:15:47.0636221Z Ran 3 tests in 15.177s 2022-11-23T02:15:47.0636385Z 2022-11-23T02:15:47.0636461Z OK 2022-11-23T02:15:47.0636594Z 2022-11-23T02:15:47.0636719Z Generating XML reports... 2022-11-23T02:15:47.0637399Z Generated XML report: test-reports/python-unittest/distributed._tensor.parallel.test_tp_examples/TEST-DistTensorParallelExampleTest-20221123021531.xml 2022-11-23T02:15:47.0637813Z 2022-11-23T02:15:47.0638125Z ##[endgroup] 2022-11-23T02:15:47.0638785Z FINISHED PRINTING LOG FILE of distributed/_tensor/parallel/test_tp_examples (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-parallel-test_tp_examples_t3v94amr) 2022-11-23T02:15:47.0639288Z 2022-11-23T02:15:47.0639561Z Running distributed/fsdp/test_fsdp_memory ... [2022-11-23 02:15:47.060417] 2022-11-23T02:15:47.0640246Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_memory.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:15:47.060722] 2022-11-23T02:16:05.8809006Z 2022-11-23T02:16:05.8809722Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_memory 2022-11-23T02:16:05.8811010Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_memory (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_memory_esnbdd1z) 2022-11-23T02:16:05.8811404Z 2022-11-23T02:16:05.8811522Z Running tests... 2022-11-23T02:16:05.8812045Z ---------------------------------------------------------------------- 2022-11-23T02:16:05.8812619Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_memory 2022-11-23T02:16:05.8813139Z test_fsdp_memory_ckpt_ckpt (__main__.TestFSDPMemory) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:16:05.8813635Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43880 2022-11-23T02:16:05.8814099Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43881 2022-11-23T02:16:05.8814750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:05.8815196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:05.8815787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:05.8816663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:05.8817583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:05.8818160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:05.8819123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:05.8820117Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:05.8821131Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:05.8822110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:05.8823270Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:05.8823983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:05.8824524Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:05.8824992Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:05.8826526Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:16:05.8827359Z warnings.warn( 2022-11-23T02:16:05.8828541Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:16:05.8829789Z warnings.warn( 2022-11-23T02:16:05.8830031Z dist init r=1, world=2 2022-11-23T02:16:05.8830287Z dist init r=0, world=2 2022-11-23T02:16:05.8830530Z ok (9.268s) 2022-11-23T02:16:05.8830974Z test_fsdp_memory_ckpt_no_ckpt (__main__.TestFSDPMemory) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 43993 2022-11-23T02:16:05.8831493Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 43994 2022-11-23T02:16:05.8832131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:05.8832594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:05.8833159Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:05.8833643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:05.8834241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:05.8834695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:05.8835262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:05.8835741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:05.8836206Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:05.8836697Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:05.8837372Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:16:05.8838070Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:05.8838609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:05.8839074Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:05.8840360Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:16:05.8841165Z warnings.warn( 2022-11-23T02:16:05.8842342Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:16:05.8843124Z warnings.warn( 2022-11-23T02:16:05.8843455Z dist init r=0, world=2 2022-11-23T02:16:05.8843723Z dist init r=1, world=2 2022-11-23T02:16:05.8843967Z ok (7.218s) 2022-11-23T02:16:05.8844119Z 2022-11-23T02:16:05.8844394Z ---------------------------------------------------------------------- 2022-11-23T02:16:05.8844712Z Ran 2 tests in 16.486s 2022-11-23T02:16:05.8844879Z 2022-11-23T02:16:05.8844973Z OK 2022-11-23T02:16:05.8845108Z 2022-11-23T02:16:05.8845235Z Generating XML reports... 
2022-11-23T02:16:05.8845809Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_memory/TEST-TestFSDPMemory-20221123021548.xml 2022-11-23T02:16:05.8846157Z 2022-11-23T02:16:05.8846512Z ##[endgroup] 2022-11-23T02:16:05.8847117Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_memory (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_memory_esnbdd1z) 2022-11-23T02:16:05.8847563Z 2022-11-23T02:16:05.8847831Z Running distributed/_tensor/test_pointwise_ops ... [2022-11-23 02:16:05.881025] 2022-11-23T02:16:05.8848542Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_pointwise_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:16:05.881333] 2022-11-23T02:16:25.9398568Z 2022-11-23T02:16:25.9399010Z Expand the folded group to see the log file of distributed/_tensor/test_pointwise_ops 2022-11-23T02:16:25.9401985Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_pointwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_pointwise_ops_1w6cn93g) 2022-11-23T02:16:25.9402379Z 2022-11-23T02:16:25.9402493Z Running tests... 2022-11-23T02:16:25.9403391Z ---------------------------------------------------------------------- 2022-11-23T02:16:25.9403984Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_pointwise_ops 2022-11-23T02:16:25.9404511Z test_activations (__main__.DistElementwiseOpsTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:16:25.9404990Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44141 2022-11-23T02:16:25.9405646Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44142 2022-11-23T02:16:25.9406567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9407019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9409623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9410142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9410741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9411192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9411772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9412270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9412717Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:25.9413202Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:25.9413692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:25.9414193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:25.9414865Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:16:25.9415559Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:25.9415988Z ok (5.631s) 2022-11-23T02:16:25.9416671Z test_dropout (__main__.DistElementwiseOpsTest) ... skip: testing RNG based ops is broken: https://github.com/pytorch/tau/issues/494 (0.001s) 2022-11-23T02:16:25.9417299Z test_dropout_backward (__main__.DistElementwiseOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44222 2022-11-23T02:16:25.9417833Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44223 2022-11-23T02:16:25.9418461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9418911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9419480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9420060Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9420677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9421133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9421699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9422178Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9422622Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:25.9423112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:25.9423591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-11-23T02:16:25.9424087Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:25.9424754Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:25.9425454Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:25.9425833Z ok (4.111s) 2022-11-23T02:16:25.9426268Z test_dropout_errors (__main__.DistElementwiseOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44303 2022-11-23T02:16:25.9426796Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44304 2022-11-23T02:16:25.9427394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9427848Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9428422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9428903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9430094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9430675Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9431261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9431738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9432165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:25.9432661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T02:16:25.9433152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:25.9433626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:25.9434296Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:25.9435232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:25.9435842Z ok (4.011s) 2022-11-23T02:16:25.9436254Z test_mul_out (__main__.DistElementwiseOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44382 2022-11-23T02:16:25.9436775Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44383 2022-11-23T02:16:25.9437403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9437846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9438533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9439009Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9439597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:25.9440032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:25.9440609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:25.9441075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:25.9441518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:25.9441994Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:25.9442492Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:25.9442988Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:25.9443639Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:25.9444338Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:25.9444742Z ok (3.911s) 2022-11-23T02:16:25.9444895Z 2022-11-23T02:16:25.9445168Z ---------------------------------------------------------------------- 2022-11-23T02:16:25.9445483Z Ran 5 tests in 17.666s 2022-11-23T02:16:25.9445648Z 2022-11-23T02:16:25.9445755Z OK (skipped=1) 2022-11-23T02:16:25.9445906Z 2022-11-23T02:16:25.9446112Z Generating XML reports... 2022-11-23T02:16:25.9447265Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_pointwise_ops/TEST-DistElementwiseOpsTest-20221123021607.xml 2022-11-23T02:16:25.9447657Z 2022-11-23T02:16:25.9448014Z ##[endgroup] 2022-11-23T02:16:25.9448636Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_pointwise_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_pointwise_ops_1w6cn93g) 2022-11-23T02:16:25.9449000Z 2022-11-23T02:16:25.9449291Z Running distributed/fsdp/test_fsdp_tp_integration ... [2022-11-23 02:16:25.940005] 2022-11-23T02:16:25.9449991Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_tp_integration.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:16:25.940313] 2022-11-23T02:16:49.4682690Z 2022-11-23T02:16:49.4685267Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_tp_integration 2022-11-23T02:16:49.4686274Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_tp_integration (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_tp_integration_zbi7ygfz) 2022-11-23T02:16:49.4686698Z 2022-11-23T02:16:49.4686810Z Running tests... 2022-11-23T02:16:49.4688368Z ---------------------------------------------------------------------- 2022-11-23T02:16:49.4689297Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_tp_integration 2022-11-23T02:16:49.4690659Z test_fsdp_tp_checkpoint_integration (__main__.TestTPFSDPIntegration) 2022-11-23T02:16:49.4691271Z Tests checkpointing for TP + FSDP integration. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:16:49.4691777Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44496 2022-11-23T02:16:49.4692218Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44497 2022-11-23T02:16:49.4692875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4693345Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4693913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4695947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4696726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4697226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4697825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:16:49.4698303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4698766Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:49.4699278Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:49.4703634Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4704514Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4705445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:49.4706458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:49.4707152Z dist init r=0, world=2 2022-11-23T02:16:49.4707609Z dist init r=1, world=2 2022-11-23T02:16:49.4708113Z skip: Need at least 4 CUDA devices (5.589s) 2022-11-23T02:16:49.4709576Z test_fsdp_tp_integration_tensor_parallel_size_2_cpu_offload_CPUOffload(offload_params=False) (__main__.TestTPFSDPIntegration) 2022-11-23T02:16:49.4710686Z Tests training for TP + FSDP integration by comparing an FSDP-only ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44575 2022-11-23T02:16:49.4711248Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44576 2022-11-23T02:16:49.4711879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4712349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4712926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4713406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4713998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4714496Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4715065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4715541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4716014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:49.4716524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:49.4717357Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4718092Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:16:49.4718628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:49.4719111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:49.4719457Z dist init r=0, world=2 2022-11-23T02:16:49.4719713Z dist init r=1, world=2 2022-11-23T02:16:49.4720007Z skip: Need at least 4 CUDA devices (3.812s) 2022-11-23T02:16:49.4720458Z test_fsdp_tp_integration_tensor_parallel_size_2_cpu_offload_CPUOffload(offload_params=True) (__main__.TestTPFSDPIntegration) 2022-11-23T02:16:49.4721311Z Tests training for TP + FSDP integration by comparing an FSDP-only ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44654 2022-11-23T02:16:49.4721868Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44655 2022-11-23T02:16:49.4722484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4722926Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4723512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4723991Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4724562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4725023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4725604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4726084Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4726527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:49.4727030Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:49.4727699Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4728380Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4728909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:49.4729395Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:49.4729754Z dist init r=0, world=2 2022-11-23T02:16:49.4729990Z dist init r=1, world=2 2022-11-23T02:16:49.4730281Z skip: Need at least 4 CUDA devices (3.913s) 2022-11-23T02:16:49.4730754Z test_fsdp_tp_integration_tensor_parallel_size_4_cpu_offload_CPUOffload(offload_params=False) (__main__.TestTPFSDPIntegration) 2022-11-23T02:16:49.4731493Z Tests training for TP + FSDP integration by comparing an FSDP-only ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44733 2022-11-23T02:16:49.4732038Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44734 2022-11-23T02:16:49.4732654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4733109Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4733677Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4734152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4734802Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4735261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4735831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4736304Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4736767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:49.4737254Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:49.4737918Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4738751Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:16:49.4739287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:49.4739751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:49.4740104Z dist init r=0, world=2 2022-11-23T02:16:49.4740357Z dist init r=1, world=2 2022-11-23T02:16:49.4740629Z skip: Need at least 4 CUDA devices (3.913s) 2022-11-23T02:16:49.4741096Z test_fsdp_tp_integration_tensor_parallel_size_4_cpu_offload_CPUOffload(offload_params=True) (__main__.TestTPFSDPIntegration) 2022-11-23T02:16:49.4742119Z Tests training for TP + FSDP integration by comparing an FSDP-only ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44812 2022-11-23T02:16:49.4742676Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44813 2022-11-23T02:16:49.4743278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4743731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4744316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4744793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4745359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:16:49.4745813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:16:49.4746389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:16:49.4746842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:16:49.4747310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:16:49.4747819Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:16:49.4748487Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4749711Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:16:49.4750259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:16:49.4750738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:16:49.4751094Z dist init r=1, world=2 2022-11-23T02:16:49.4751328Z dist init r=0, world=2 2022-11-23T02:16:49.4751618Z skip: Need at least 4 CUDA devices (3.912s) 2022-11-23T02:16:49.4751818Z 2022-11-23T02:16:49.4752097Z ---------------------------------------------------------------------- 2022-11-23T02:16:49.4752414Z Ran 5 tests in 21.140s 2022-11-23T02:16:49.4752575Z 2022-11-23T02:16:49.4752685Z OK (skipped=5) 2022-11-23T02:16:49.4752842Z 2022-11-23T02:16:49.4753068Z Generating XML reports... 2022-11-23T02:16:49.4753703Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_tp_integration/TEST-TestTPFSDPIntegration-20221123021627.xml 2022-11-23T02:16:49.4754088Z 2022-11-23T02:16:49.4754543Z ##[endgroup] 2022-11-23T02:16:49.4755193Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_tp_integration (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_tp_integration_zbi7ygfz) 2022-11-23T02:16:49.4755578Z 2022-11-23T02:16:49.4755869Z Running distributed/fsdp/test_fsdp_clip_grad_norm ... [2022-11-23 02:16:49.468330] 2022-11-23T02:16:49.4756560Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_clip_grad_norm.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:16:49.468602] 2022-11-23T02:17:19.1434523Z 2022-11-23T02:17:19.1435043Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_clip_grad_norm 2022-11-23T02:17:19.1439180Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_clip_grad_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_clip_grad_norm_jg1vpt4x) 2022-11-23T02:17:19.1439726Z 2022-11-23T02:17:19.1439851Z Running tests... 2022-11-23T02:17:19.1440591Z ---------------------------------------------------------------------- 2022-11-23T02:17:19.1441218Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm 2022-11-23T02:17:19.1441939Z test_ddp_parity (__main__.TestClipGradNorm) 2022-11-23T02:17:19.1442435Z Tests FSDP with ``FullyShardedDataParallel.clip_grad_norm_()` against ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:17:19.1442963Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 44926 2022-11-23T02:17:19.1443416Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 44927 2022-11-23T02:17:19.1444062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:19.1444543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:19.1445136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:19.1445601Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:19.1446242Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:19.1446699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:19.1447284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:17:19.1447754Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:19.1448219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:17:19.1448902Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:17:19.1449433Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:17:19.1450100Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:17:19.1450629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:19.1451119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:19.1452852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1453375Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1453875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1454599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1455090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1455587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1456674Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:17:19.1458031Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1458919Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1459729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1461758Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1463415Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1464675Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1466885Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:17:19.1469898Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1470928Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1471415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1471908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1472396Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1472878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1473346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1473891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1474377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1474864Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1475344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1475951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1476449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1476932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1477392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:17:19.1477868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1478344Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1479460Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1480196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1480680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1481156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1481632Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1482096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1482569Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1483053Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1483510Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1483993Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1484510Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1484987Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1485447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:17:19.1486461Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1487212Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1487693Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1488158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1488636Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1489115Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1489593Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1490055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1490529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1491537Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1492856Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:17:19.1494123Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1495364Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1496680Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1497422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1497914Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1498394Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1498863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1499871Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1500603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:17:19.1501086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1501552Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1502029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1502509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1502989Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1503468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1503924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1504404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1504882Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1505357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1505823Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1506825Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1507563Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1508029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1508564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:17:19.1509619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1510104Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1510585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1511064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1511545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1512648Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1513902Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1515148Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1516408Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:17:19.1517664Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:17:19.1518402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1518893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1519361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1519852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:17:19.1520214Z dist init r=0, world=2 2022-11-23T02:17:19.1520467Z dist init r=1, world=2 2022-11-23T02:17:19.1520690Z ok (22.904s) 2022-11-23T02:17:19.1520973Z test_non_root (__main__.TestClipGradNorm) 2022-11-23T02:17:19.1521615Z Tests that calling ``clip_grad_norm_()`` on a non-root FSDP instance ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45009 2022-11-23T02:17:19.1522147Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45010 2022-11-23T02:17:19.1522770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:19.1523230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:19.1523812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:19.1524277Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:19.1524865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:19.1525322Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:19.1525962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:19.1526447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:19.1526912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:17:19.1527416Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:17:19.1528070Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:17:19.1528766Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:17:19.1529374Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:19.1529860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:19.1530203Z dist init r=1, world=2 2022-11-23T02:17:19.1530455Z dist init r=0, world=2 2022-11-23T02:17:19.1530696Z ok (4.413s) 2022-11-23T02:17:19.1530829Z 2022-11-23T02:17:19.1531105Z ---------------------------------------------------------------------- 2022-11-23T02:17:19.1531441Z Ran 2 tests in 27.318s 2022-11-23T02:17:19.1531605Z 2022-11-23T02:17:19.1531701Z OK 2022-11-23T02:17:19.1531835Z 2022-11-23T02:17:19.1531942Z Generating XML reports... 2022-11-23T02:17:19.1532563Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestClipGradNorm-20221123021651.xml 2022-11-23T02:17:19.1532930Z 2022-11-23T02:17:19.1533332Z ##[endgroup] 2022-11-23T02:17:19.1533951Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_clip_grad_norm (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_clip_grad_norm_jg1vpt4x) 2022-11-23T02:17:19.1534332Z 2022-11-23T02:17:19.1534651Z Running distributed/_shard/sharded_tensor/ops/test_matrix_ops ... [2022-11-23 02:17:19.143514] 2022-11-23T02:17:19.1535395Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_matrix_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:17:19.143788] 2022-11-23T02:17:49.6312259Z 2022-11-23T02:17:49.6312796Z Expand the folded group to see the log file of distributed/_shard/sharded_tensor/ops/test_matrix_ops 2022-11-23T02:17:49.6317012Z ##[group]PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_matrix_ops_rd_22hrb) 2022-11-23T02:17:49.6317492Z 2022-11-23T02:17:49.6317591Z Running tests... 
2022-11-23T02:17:49.6319768Z ---------------------------------------------------------------------- 2022-11-23T02:17:49.6320414Z Test results will be stored in test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops 2022-11-23T02:17:49.6321024Z test_sharded_tensor_contiguous (__main__.TestShardedTensorMatrixOps) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:17:49.6321548Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45127 2022-11-23T02:17:49.6321994Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45128 2022-11-23T02:17:49.6322452Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45129 2022-11-23T02:17:49.6322899Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45130 2022-11-23T02:17:49.6323543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6324633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6325252Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6325961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6326580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6327025Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6327610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6328091Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6328696Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6329135Z warnings.warn(f"loaded 
{len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6329830Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6330316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6330890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6331349Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6331934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6332407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6332835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6333328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6333811Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6334288Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6334675Z skip: Need at least 4 CUDA devices (4.013s) 2022-11-23T02:17:49.6335199Z test_sharded_tensor_layer_norm (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45263 2022-11-23T02:17:49.6335768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45264 2022-11-23T02:17:49.6336224Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45265 2022-11-23T02:17:49.6336663Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45266 2022-11-23T02:17:49.6337292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6337759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6338330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6338814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6339412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6339867Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6340446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6340920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6341607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6342061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6342642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6343123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6343748Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:17:49.6344209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6344790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6345247Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6345693Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6346178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6346654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6347178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6347572Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:17:49.6348091Z test_sharded_tensor_layer_norm_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45399 2022-11-23T02:17:49.6348639Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45400 2022-11-23T02:17:49.6349475Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45401 2022-11-23T02:17:49.6349933Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45402 2022-11-23T02:17:49.6350563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6351002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6351591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6352065Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6352657Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6353091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6353671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6354142Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6354709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6355159Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6355737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6356211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6356780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:17:49.6357228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6357804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6358256Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6358694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6359173Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6359648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6360108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6360496Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:17:49.6361099Z test_sharded_tensor_masked_fill (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45535 2022-11-23T02:17:49.6361677Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45536 2022-11-23T02:17:49.6362116Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45537 2022-11-23T02:17:49.6362568Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45538 2022-11-23T02:17:49.6363189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6363627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6364204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6364757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6365348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6365780Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6366361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6366832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6367396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6367859Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6368422Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6368897Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6369476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:17:49.6369909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6370487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6370952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6371396Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6371861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6372328Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6372801Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6373183Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:17:49.6373707Z test_sharded_tensor_masked_fill_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45671 2022-11-23T02:17:49.6374278Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45672 2022-11-23T02:17:49.6374732Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45673 2022-11-23T02:17:49.6375165Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45674 2022-11-23T02:17:49.6375778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6376233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6376808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6377270Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6377851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6378370Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6378947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6379430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6380021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6380471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6381029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6381542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6382124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled 
tests 2022-11-23T02:17:49.6382607Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6383186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6383659Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6384146Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6384636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6385096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6385570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6385978Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:17:49.6386468Z test_sharded_tensor_softmax (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45807 2022-11-23T02:17:49.6387029Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45808 2022-11-23T02:17:49.6387482Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45809 2022-11-23T02:17:49.6387931Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45810 2022-11-23T02:17:49.6388536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6389291Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6389895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6390359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6390940Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6391386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6391952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6392388Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6392968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6393445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6394019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6394493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6395081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6395528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6396165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6396656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6397106Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6397591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6398046Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6398510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6398984Z skip: Need at least 4 CUDA devices (2.410s) 
2022-11-23T02:17:49.6399478Z test_sharded_tensor_transpose (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 45943 2022-11-23T02:17:49.6400042Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 45944 2022-11-23T02:17:49.6400502Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 45945 2022-11-23T02:17:49.6400946Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 45946 2022-11-23T02:17:49.6401549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6402009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6402596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6403051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6403641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6404090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6404667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6405118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6405698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6406145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6406715Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6407164Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6407752Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6408196Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6408754Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6409216Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6409658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6410134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6410588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6411050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6411444Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:17:49.6411949Z test_sharded_tensor_transpose_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46079 2022-11-23T02:17:49.6412515Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46080 2022-11-23T02:17:49.6413020Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46081 2022-11-23T02:17:49.6413484Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46082 2022-11-23T02:17:49.6414119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6414573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6415169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6415653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6416285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6416764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6417353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6417811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6418397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6418841Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6419419Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6419873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6420454Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:17:49.6420925Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6421488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6421967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6422404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6422880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6423336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6423798Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6424191Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:17:49.6424681Z test_sharded_tensor_type_as (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46215 2022-11-23T02:17:49.6425247Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46216 2022-11-23T02:17:49.6425701Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46217 2022-11-23T02:17:49.6426149Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46218 2022-11-23T02:17:49.6426747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6427207Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6427792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6428268Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6428857Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6429702Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6430292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6430843Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6431449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6431902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6432483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6432935Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6433517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6434041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6434616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6435074Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6435528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6436012Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6436466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6436928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6437324Z skip: Need at least 4 CUDA devices (2.411s) 
2022-11-23T02:17:49.6437826Z test_sharded_tensor_view (__main__.TestShardedTensorMatrixOps) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46351 2022-11-23T02:17:49.6438365Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46352 2022-11-23T02:17:49.6438818Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46353 2022-11-23T02:17:49.6439270Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46354 2022-11-23T02:17:49.6439869Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6440329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6440910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6441378Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6441946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6442402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6442977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6455966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6456614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6457086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6457680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6458177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6458776Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6459239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6459825Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6460288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6460848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6461347Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6461835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6462294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6462687Z skip: Need at least 4 CUDA devices (2.510s) 2022-11-23T02:17:49.6463197Z test_sharded_tensor_view_error (__main__.TestShardedTensorMatrixOps) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46487 2022-11-23T02:17:49.6463818Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46488 2022-11-23T02:17:49.6464280Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 46489 2022-11-23T02:17:49.6464750Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 46490 2022-11-23T02:17:49.6465386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6465827Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6466412Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6466896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6467470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6467930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6468511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6469370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6469962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:17:49.6470417Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6470999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6471491Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6472056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 
51 slow tests 2022-11-23T02:17:49.6472507Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:17:49.6473090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:17:49.6473546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:17:49.6473995Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:17:49.6474479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:17:49.6474963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:17:49.6475422Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:17:49.6475814Z skip: Need at least 4 CUDA devices (2.410s) 2022-11-23T02:17:49.6476013Z 2022-11-23T02:17:49.6476291Z ---------------------------------------------------------------------- 2022-11-23T02:17:49.6476604Z Ran 11 tests in 28.216s 2022-11-23T02:17:49.6476777Z 2022-11-23T02:17:49.6476887Z OK (skipped=11) 2022-11-23T02:17:49.6477046Z 2022-11-23T02:17:49.6477171Z Generating XML reports... 2022-11-23T02:17:49.6477851Z Generated XML report: test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20221123021721.xml 2022-11-23T02:17:49.6478373Z 2022-11-23T02:17:49.6478856Z ##[endgroup] 2022-11-23T02:17:49.6479551Z FINISHED PRINTING LOG FILE of distributed/_shard/sharded_tensor/ops/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_shard-sharded_tensor-ops-test_matrix_ops_rd_22hrb) 2022-11-23T02:17:49.6479964Z 2022-11-23T02:17:49.6480232Z Running distributed/test_c10d_spawn_ucc ... 
[2022-11-23 02:17:49.631506] 2022-11-23T02:17:49.6480945Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_spawn_ucc.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:17:49.631780] 2022-11-23T02:18:45.4525480Z 2022-11-23T02:18:45.4525976Z Expand the folded group to see the log file of distributed/test_c10d_spawn_ucc 2022-11-23T02:18:45.4528055Z ##[group]PRINTING LOG FILE of distributed/test_c10d_spawn_ucc (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_ucc_bt2_z199) 2022-11-23T02:18:45.4528823Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3l__u95w 2022-11-23T02:18:45.4529657Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3l__u95w/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4530556Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4531033Z 2022-11-23T02:18:45.4531344Z 2022-11-23T02:18:45.4536811Z , <__main__.TestDistributedNNFunctionsUcc testMethod=test_all_to_all>, <__main__.TestDistributedNNFunctionsUcc testMethod=test_all_to_all_single>, <__main__.TestDistributedNNFunctionsUcc testMethod=test_allreduce>, <__main__.TestDistributedNNFunctionsUcc testMethod=test_broadcast>, <__main__.TestDistributedNNFunctionsUcc testMethod=test_reduce>]> 2022-11-23T02:18:45.4538420Z test_all_gather (__main__.TestDistributedNNFunctionsUcc) 2022-11-23T02:18:45.4539024Z test_all_to_all (__main__.TestDistributedNNFunctionsUcc) 2022-11-23T02:18:45.4539632Z test_all_to_all_single (__main__.TestDistributedNNFunctionsUcc) 2022-11-23T02:18:45.4540395Z test_allreduce (__main__.TestDistributedNNFunctionsUcc) 2022-11-23T02:18:45.4541120Z test_broadcast (__main__.TestDistributedNNFunctionsUcc) 2022-11-23T02:18:45.4541864Z test_reduce (__main__.TestDistributedNNFunctionsUcc) 2022-11-23T02:18:45.4543026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T02:18:45.4545823Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4546606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4547129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4547616Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbzqa3ubs 2022-11-23T02:18:45.4548163Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbzqa3ubs/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4548606Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4548808Z 2022-11-23T02:18:45.4548919Z Running tests... 2022-11-23T02:18:45.4549735Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4550270Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_ucc 2022-11-23T02:18:45.4550860Z test_all_gather (__main__.TestDistributedNNFunctionsUcc) ... skip: runs into illegal memory access on first assertEqual check when run locally (0.000s) 2022-11-23T02:18:45.4551226Z 2022-11-23T02:18:45.4551495Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4551836Z Ran 1 test in 0.001s 2022-11-23T02:18:45.4551982Z 2022-11-23T02:18:45.4552096Z OK (skipped=1) 2022-11-23T02:18:45.4552268Z 2022-11-23T02:18:45.4552396Z Generating XML reports... 
2022-11-23T02:18:45.4553250Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021756.xml 2022-11-23T02:18:45.4554044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4554492Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4555084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4555569Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4556036Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp78lrs37y 2022-11-23T02:18:45.4556694Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp78lrs37y/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4557139Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4557340Z 2022-11-23T02:18:45.4557453Z Running tests... 2022-11-23T02:18:45.4557857Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4558407Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_ucc 2022-11-23T02:18:45.4558989Z test_all_to_all (__main__.TestDistributedNNFunctionsUcc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46726 2022-11-23T02:18:45.4559518Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46727 2022-11-23T02:18:45.4560142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4560598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4561185Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4561649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4562241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4562789Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4563378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4563854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4564330Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpymui44ft 2022-11-23T02:18:45.4564862Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpymui44ft/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4565403Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoblxnshe 2022-11-23T02:18:45.4565949Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoblxnshe/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4566367Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4566698Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4567111Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:45.4567594Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:45.4568088Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:45.4568580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:45.4569256Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4569941Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4570520Z [1669169885.150391] [08317a7e7676:46726:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4571105Z [1669169885.177811] [08317a7e7676:46726:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4571596Z [1669169885.177811] [08317a7e7676:46726:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4572097Z [1669169885.153818] [08317a7e7676:46727:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4572599Z [1669169885.177756] [08317a7e7676:46727:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4573068Z [1669169885.177756] [08317a7e7676:46727:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4573475Z ok (5.884s) 2022-11-23T02:18:45.4573624Z 2022-11-23T02:18:45.4573884Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4574216Z Ran 1 test in 5.884s 2022-11-23T02:18:45.4574382Z 2022-11-23T02:18:45.4574475Z OK 2022-11-23T02:18:45.4574612Z 2022-11-23T02:18:45.4574736Z Generating XML reports... 
2022-11-23T02:18:45.4575367Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021800.xml 2022-11-23T02:18:45.4576115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4576572Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4577140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4577621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4578094Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaornjjqs 2022-11-23T02:18:45.4578649Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaornjjqs/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4579064Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4579264Z 2022-11-23T02:18:45.4579372Z Running tests... 2022-11-23T02:18:45.4579784Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4580310Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_ucc 2022-11-23T02:18:45.4580898Z test_all_to_all_single (__main__.TestDistributedNNFunctionsUcc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46847 2022-11-23T02:18:45.4581460Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46848 2022-11-23T02:18:45.4582078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4582522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4583112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4583645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4584223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4584676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4585254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4585727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4586183Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwb4cidy7 2022-11-23T02:18:45.4586735Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwb4cidy7/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4587279Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp39h5e94t 2022-11-23T02:18:45.4587884Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp39h5e94t/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4588308Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4588724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:45.4589612Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:18:45.4590009Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4590418Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:45.4590910Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:45.4591688Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4592381Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4592955Z [1669169894.712371] [08317a7e7676:46847:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4593470Z [1669169894.730976] [08317a7e7676:46847:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4593942Z [1669169894.730976] [08317a7e7676:46847:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4594441Z [1669169894.714697] [08317a7e7676:46848:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4594947Z [1669169894.732496] [08317a7e7676:46848:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4595414Z [1669169894.732496] [08317a7e7676:46848:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4595758Z ok (5.680s) 2022-11-23T02:18:45.4595889Z 2022-11-23T02:18:45.4596166Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4596496Z Ran 1 test in 5.680s 2022-11-23T02:18:45.4596661Z 2022-11-23T02:18:45.4596753Z OK 2022-11-23T02:18:45.4596888Z 2022-11-23T02:18:45.4596993Z Generating XML reports... 
2022-11-23T02:18:45.4597642Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021810.xml 2022-11-23T02:18:45.4598389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4598847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4599418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4599899Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4600370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1qmlktkd 2022-11-23T02:18:45.4600898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1qmlktkd/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4601330Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4601531Z 2022-11-23T02:18:45.4601641Z Running tests... 2022-11-23T02:18:45.4602051Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4602579Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_ucc 2022-11-23T02:18:45.4603161Z test_allreduce (__main__.TestDistributedNNFunctionsUcc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 46968 2022-11-23T02:18:45.4603717Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 46969 2022-11-23T02:18:45.4604398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4604873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4605460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4605944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4606512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4606964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4607548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4608085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4608535Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5h9a_494 2022-11-23T02:18:45.4609080Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5h9a_494/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4609615Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxdyojw31 2022-11-23T02:18:45.4610137Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxdyojw31/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4610567Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4610982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:45.4611479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:18:45.4611872Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4612280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:45.4612774Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:45.4613432Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4614134Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4614705Z [1669169904.292047] [08317a7e7676:46969:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4615215Z [1669169904.307994] [08317a7e7676:46969:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4615672Z [1669169904.307994] [08317a7e7676:46969:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4616190Z [1669169904.284856] [08317a7e7676:46968:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4616694Z [1669169904.303787] [08317a7e7676:46968:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4617160Z [1669169904.303787] [08317a7e7676:46968:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4617486Z ok (5.682s) 2022-11-23T02:18:45.4617634Z 2022-11-23T02:18:45.4617913Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4618243Z Ran 1 test in 5.682s 2022-11-23T02:18:45.4618409Z 2022-11-23T02:18:45.4618503Z OK 2022-11-23T02:18:45.4618618Z 2022-11-23T02:18:45.4618744Z Generating XML reports... 
2022-11-23T02:18:45.4619390Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021819.xml 2022-11-23T02:18:45.4620138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4620638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4621240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4621720Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4622194Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpua6ebmqz 2022-11-23T02:18:45.4622725Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpua6ebmqz/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4623162Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4623362Z 2022-11-23T02:18:45.4623471Z Running tests... 2022-11-23T02:18:45.4623866Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4624476Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_ucc 2022-11-23T02:18:45.4625065Z test_broadcast (__main__.TestDistributedNNFunctionsUcc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47087 2022-11-23T02:18:45.4625618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47088 2022-11-23T02:18:45.4626216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4626669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4627253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4627734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4628302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4628758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4629633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4630106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4630577Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplk2dl6on 2022-11-23T02:18:45.4631128Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplk2dl6on/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4631667Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8yszg_q0 2022-11-23T02:18:45.4632188Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8yszg_q0/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4632624Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4633040Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:45.4633525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T02:18:45.4633934Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4634348Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:45.4634840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:18:45.4635495Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4636188Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4636763Z [1669169913.928078] [08317a7e7676:47087:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4637279Z [1669169913.947415] [08317a7e7676:47087:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4637741Z [1669169913.947415] [08317a7e7676:47087:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4638343Z [1669169913.928055] [08317a7e7676:47088:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4638863Z [1669169913.947787] [08317a7e7676:47088:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4639335Z [1669169913.947787] [08317a7e7676:47088:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4639662Z ok (5.784s) 2022-11-23T02:18:45.4639812Z 2022-11-23T02:18:45.4640087Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4640416Z Ran 1 test in 5.784s 2022-11-23T02:18:45.4640657Z 2022-11-23T02:18:45.4640731Z OK 2022-11-23T02:18:45.4640867Z 2022-11-23T02:18:45.4640992Z Generating XML reports... 
2022-11-23T02:18:45.4641643Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021829.xml 2022-11-23T02:18:45.4642389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4642828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4643410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4643888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4644340Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppwjujke9 2022-11-23T02:18:45.4644891Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppwjujke9/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4645333Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4645534Z 2022-11-23T02:18:45.4645642Z Running tests... 2022-11-23T02:18:45.4646036Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4646579Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_ucc 2022-11-23T02:18:45.4647158Z test_reduce (__main__.TestDistributedNNFunctionsUcc) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47206 2022-11-23T02:18:45.4647682Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47207 2022-11-23T02:18:45.4648303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4648764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4649347Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4649812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4650399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:18:45.4650856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:18:45.4651433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:18:45.4651884Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:18:45.4652359Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa3vxqvmc 2022-11-23T02:18:45.4652912Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa3vxqvmc/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4653435Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpco_u2fe_ 2022-11-23T02:18:45.4653976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpco_u2fe_/_remote_module_non_scriptable.py 2022-11-23T02:18:45.4654407Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4654821Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:18:45.4655381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:18:45.4655800Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:18:45.4656208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:18:45.4656690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:18:45.4657364Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4658067Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:18:45.4658702Z [1669169923.634267] [08317a7e7676:47207:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4659196Z [1669169923.650471] [08317a7e7676:47207:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4659668Z [1669169923.650471] [08317a7e7676:47207:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4660182Z [1669169923.629642] [08317a7e7676:47206:0] ec_cuda.c:343 cuda ec WARN CUDA cooperative groups are not supported. Fall back to non cooperative launch. 2022-11-23T02:18:45.4660683Z [1669169923.648891] [08317a7e7676:47206:0] parser.c:1993 UCX WARN unused environment variables: UCX_COMMIT; UCX_HOME 2022-11-23T02:18:45.4661136Z [1669169923.648891] [08317a7e7676:47206:0] parser.c:1993 UCX WARN (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning) 2022-11-23T02:18:45.4661486Z ok (5.783s) 2022-11-23T02:18:45.4661636Z 2022-11-23T02:18:45.4661909Z ---------------------------------------------------------------------- 2022-11-23T02:18:45.4662240Z Ran 1 test in 5.783s 2022-11-23T02:18:45.4662384Z 2022-11-23T02:18:45.4662479Z OK 2022-11-23T02:18:45.4662618Z 2022-11-23T02:18:45.4662741Z Generating XML reports... 
2022-11-23T02:18:45.4663383Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021838.xml 2022-11-23T02:18:45.4663769Z 2022-11-23T02:18:45.4664238Z ##[endgroup] 2022-11-23T02:18:45.4664833Z FINISHED PRINTING LOG FILE of distributed/test_c10d_spawn_ucc (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_ucc_bt2_z199) 2022-11-23T02:18:45.4665183Z 2022-11-23T02:18:45.4665456Z Running distributed/_tensor/test_matrix_ops ... [2022-11-23 02:18:45.453102] 2022-11-23T02:18:45.4666145Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_matrix_ops.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:18:45.453460] 2022-11-23T02:19:22.6705935Z 2022-11-23T02:19:22.6706481Z Expand the folded group to see the log file of distributed/_tensor/test_matrix_ops 2022-11-23T02:19:22.6707649Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_matrix_ops_9jbcgiuh) 2022-11-23T02:19:22.6708026Z 2022-11-23T02:19:22.6708123Z Running tests... 2022-11-23T02:19:22.6708640Z ---------------------------------------------------------------------- 2022-11-23T02:19:22.6709571Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_matrix_ops 2022-11-23T02:19:22.6712309Z test_addmm (__main__.DistMatrixOpsTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:19:22.6713334Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47325 2022-11-23T02:19:22.6714236Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47326 2022-11-23T02:19:22.6715195Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6716185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6717161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6717675Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6718280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6718741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6719330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6719811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6720644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:19:22.6721344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:19:22.6721853Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:19:22.6722352Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:19:22.6723193Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:19:22.6724383Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6724801Z ok (5.993s) 2022-11-23T02:19:22.6725246Z test_addmm_auto_redistribute (__main__.DistMatrixOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47406 2022-11-23T02:19:22.6725778Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47407 2022-11-23T02:19:22.6726468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6726936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6727521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6727998Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6728567Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6729021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6729604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6730081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6730509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:19:22.6731008Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:19:22.6731505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:19:22.6731979Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:19:22.6732649Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6733352Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6733754Z ok (4.512s) 2022-11-23T02:19:22.6734153Z test_baddbmm (__main__.DistMatrixOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47491 2022-11-23T02:19:22.6734671Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47492 2022-11-23T02:19:22.6735287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6735808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6736410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6736885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6737468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6737899Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6738479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6739013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6739462Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:19:22.6739949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:19:22.6740445Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:19:22.6740939Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:19:22.6741588Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6742290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6742689Z ok (6.517s) 2022-11-23T02:19:22.6743101Z test_bmm (__main__.DistMatrixOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47576 2022-11-23T02:19:22.6743595Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47577 2022-11-23T02:19:22.6744215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6744674Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6745238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6745719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6746303Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6746748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6747310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6747785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6748226Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:19:22.6748726Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:19:22.6749526Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 
2022-11-23T02:19:22.6750025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:19:22.6750699Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6751383Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6751784Z ok (4.714s) 2022-11-23T02:19:22.6752201Z test_mm (__main__.DistMatrixOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47661 2022-11-23T02:19:22.6752709Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47662 2022-11-23T02:19:22.6753411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6753882Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6754470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6754931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6755517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6755965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6756544Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6757081Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6757528Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:19:22.6758033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:19:22.6758531Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:19:22.6758999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:19:22.6759672Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6760370Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6760750Z ok (4.713s) 2022-11-23T02:19:22.6761158Z test_t (__main__.DistMatrixOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47746 2022-11-23T02:19:22.6761667Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47747 2022-11-23T02:19:22.6762286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6762726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6763310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6763784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6764370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6764804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6765386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6765857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6766287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:19:22.6766786Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:19:22.6767281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:19:22.6767768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:19:22.6768415Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6769113Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6769519Z ok (4.011s) 2022-11-23T02:19:22.6769919Z test_t_partial (__main__.DistMatrixOpsTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47827 2022-11-23T02:19:22.6770438Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 47828 2022-11-23T02:19:22.6771124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6771590Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6772156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6772636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6773224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:19:22.6773673Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:19:22.6774304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:19:22.6774779Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:19:22.6775229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 
1 2022-11-23T02:19:22.6775711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:19:22.6776206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:19:22.6776697Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:19:22.6777359Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6778042Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:19:22.6778444Z ok (4.412s) 2022-11-23T02:19:22.6778597Z 2022-11-23T02:19:22.6778872Z ---------------------------------------------------------------------- 2022-11-23T02:19:22.6779188Z Ran 7 tests in 34.873s 2022-11-23T02:19:22.6779356Z 2022-11-23T02:19:22.6779449Z OK 2022-11-23T02:19:22.6779586Z 2022-11-23T02:19:22.6779711Z Generating XML reports... 2022-11-23T02:19:22.6780310Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_matrix_ops/TEST-DistMatrixOpsTest-20221123021847.xml 2022-11-23T02:19:22.6780665Z 2022-11-23T02:19:22.6781060Z ##[endgroup] 2022-11-23T02:19:22.6781671Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_matrix_ops (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_matrix_ops_9jbcgiuh) 2022-11-23T02:19:22.6782032Z 2022-11-23T02:19:22.6782324Z Running distributed/fsdp/test_fsdp_flatten_params ... [2022-11-23 02:19:22.670697] 2022-11-23T02:19:22.6783090Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_flatten_params.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:19:22.671016] 2022-11-23T02:20:01.8662187Z 2022-11-23T02:20:01.8662999Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_flatten_params 2022-11-23T02:20:01.8664078Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_flatten_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_flatten_params_4cgnqapk) 2022-11-23T02:20:01.8664474Z 2022-11-23T02:20:01.8664589Z Running tests... 2022-11-23T02:20:01.8665159Z ---------------------------------------------------------------------- 2022-11-23T02:20:01.8665747Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_flatten_params 2022-11-23T02:20:01.8669812Z test_empty_module (__main__.TestFlattenParams) 2022-11-23T02:20:01.8670310Z Tests flattening an empty module (i.e. one without any parameters). ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:20:01.8670847Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47943 2022-11-23T02:20:01.8671546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8672217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8672843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8673327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8673802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8674460Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 
2022-11-23T02:20:01.8675015Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8675495Z dist init r=0, world=1 2022-11-23T02:20:01.8675743Z ok (5.354s) 2022-11-23T02:20:01.8676044Z test_flat_param_shard_metadata (__main__.TestFlattenParams) 2022-11-23T02:20:01.8676590Z Tests that ``FlatParameter`` shard metadata are computed as expected. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 47983 2022-11-23T02:20:01.8677318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8677764Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8678354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8678839Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8679313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8679967Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8680512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8680878Z dist init r=0, world=1 2022-11-23T02:20:01.8681121Z ok (3.712s) 2022-11-23T02:20:01.8681417Z test_flatten_nothing (__main__.TestFlattenParams) 2022-11-23T02:20:01.8681935Z Tests that constructing a ``FlatParamHandle`` with no parameters ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48023 2022-11-23T02:20:01.8682693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8683136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8683717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8684195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8684668Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8685316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8685857Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8686218Z dist init r=0, world=1 2022-11-23T02:20:01.8686444Z ok (3.809s) 2022-11-23T02:20:01.8686755Z test_numel_with_shared_params (__main__.TestFlattenParams) 2022-11-23T02:20:01.8687290Z Tests that numel is preserved after flattening when there are shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48063 2022-11-23T02:20:01.8688018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8688463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8689055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8689538Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8690063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8690729Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8691261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8691625Z dist init r=0, world=1 2022-11-23T02:20:01.8691851Z ok (3.709s) 2022-11-23T02:20:01.8692208Z test_numel_without_shared_params (__main__.TestFlattenParams) 2022-11-23T02:20:01.8692742Z Tests that numel is preserved after flattening when there are no shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48103 2022-11-23T02:20:01.8693521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8693980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8694549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8695025Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8695486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8696151Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8696666Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8697029Z dist init r=0, world=1 2022-11-23T02:20:01.8697275Z ok (3.709s) 2022-11-23T02:20:01.8697577Z test_output_with_shared_params (__main__.TestFlattenParams) 2022-11-23T02:20:01.8698113Z Tests a forward pass after flattening when there are shared parameters ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48143 2022-11-23T02:20:01.8698826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8699288Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8699857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8700332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8700794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8701446Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8701986Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8702348Z dist init r=0, world=1 2022-11-23T02:20:01.8702592Z ok (4.210s) 2022-11-23T02:20:01.8702893Z test_output_without_shared_params (__main__.TestFlattenParams) 2022-11-23T02:20:01.8703426Z Tests a forward pass after flattening when there are no shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48183 2022-11-23T02:20:01.8704124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8704562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8705146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8705620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8706080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8706735Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8707326Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8707693Z dist init r=0, world=1 2022-11-23T02:20:01.8707920Z ok (4.212s) 2022-11-23T02:20:01.8708224Z test_partial_flattening (__main__.TestFlattenParams) 2022-11-23T02:20:01.8708716Z Tests flattening some submodules but not others. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48223 2022-11-23T02:20:01.8709796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8710233Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8710848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8711424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8711867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8712549Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8713082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8714367Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:20:01.8715240Z warnings.warn( 2022-11-23T02:20:01.8715476Z dist init r=0, world=1 2022-11-23T02:20:01.8715718Z ok (3.809s) 2022-11-23T02:20:01.8716049Z test_pnorm_after_step_with_shared_params (__main__.TestFlattenParams) 2022-11-23T02:20:01.8716599Z Tests for parameter Frobenius norm parity after an optimizer step when ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48263 2022-11-23T02:20:01.8717302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:20:01.8717759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:20:01.8718341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:20:01.8718804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:20:01.8719263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:20:01.8719932Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:20:01.8720466Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:20:01.8720815Z dist init r=0, world=1 2022-11-23T02:20:01.8721062Z ok (4.310s) 2022-11-23T02:20:01.8721215Z 2022-11-23T02:20:01.8721491Z ---------------------------------------------------------------------- 2022-11-23T02:20:01.8721804Z Ran 9 tests in 36.833s 2022-11-23T02:20:01.8721969Z 2022-11-23T02:20:01.8722062Z OK 2022-11-23T02:20:01.8722197Z 2022-11-23T02:20:01.8722322Z Generating XML reports... 2022-11-23T02:20:01.8722945Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_flatten_params/TEST-TestFlattenParams-20221123021924.xml 2022-11-23T02:20:01.8723300Z 2022-11-23T02:20:01.8723701Z ##[endgroup] 2022-11-23T02:20:01.8724339Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_flatten_params (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_flatten_params_4cgnqapk) 2022-11-23T02:20:01.8724730Z 2022-11-23T02:20:01.8724998Z Running distributed/test_c10d_spawn_gloo ... 
[2022-11-23 02:20:01.866273] 2022-11-23T02:20:01.8725791Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_spawn_gloo.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:20:01.866551] 2022-11-23T02:21:33.0291888Z 2022-11-23T02:21:33.0292544Z Expand the folded group to see the log file of distributed/test_c10d_spawn_gloo 2022-11-23T02:21:33.0293467Z ##[group]PRINTING LOG FILE of distributed/test_c10d_spawn_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_gloo_11n4unzf) 2022-11-23T02:21:33.0294068Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4bkw4afd 2022-11-23T02:21:33.0294626Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4bkw4afd/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0295414Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0296160Z , <__main__.DistributedDataParallelSingleProcessTest testMethod=test_cuda>, <__main__.DistributedDataParallelSingleProcessTest testMethod=test_rnn>]> 2022-11-23T02:21:33.0296965Z test_cpu (__main__.DistributedDataParallelSingleProcessTest) 2022-11-23T02:21:33.0297431Z test_cuda (__main__.DistributedDataParallelSingleProcessTest) 2022-11-23T02:21:33.0297890Z test_rnn (__main__.DistributedDataParallelSingleProcessTest) 2022-11-23T02:21:33.0298262Z 2022-11-23T02:21:33.0298632Z 2022-11-23T02:21:33.0299824Z , <__main__.TestDistributedNNFunctionsGloo testMethod=test_all_to_all>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_all_to_all_single>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_allreduce>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_broadcast>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_gather>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_reduce>, <__main__.TestDistributedNNFunctionsGloo testMethod=test_scatter>]> 2022-11-23T02:21:33.0301006Z test_all_gather (__main__.TestDistributedNNFunctionsGloo) 
2022-11-23T02:21:33.0301409Z test_all_to_all (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:21:33.0301816Z test_all_to_all_single (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:21:33.0302213Z test_allreduce (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:21:33.0302619Z test_broadcast (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:21:33.0303024Z test_gather (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:21:33.0303396Z test_reduce (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:21:33.0303790Z test_scatter (__main__.TestDistributedNNFunctionsGloo) 2022-11-23T02:21:33.0304507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0304954Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0305557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0306039Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0306518Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy3gi8wdw 2022-11-23T02:21:33.0312189Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy3gi8wdw/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0312656Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0312877Z 2022-11-23T02:21:33.0312993Z Running tests... 2022-11-23T02:21:33.0313463Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0314023Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0314670Z test_cpu (__main__.DistributedDataParallelSingleProcessTest) ... INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:21:33.0315343Z ok (0.023s) 2022-11-23T02:21:33.0315520Z 2022-11-23T02:21:33.0315805Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0316333Z Ran 1 test in 0.023s 2022-11-23T02:21:33.0316642Z 2022-11-23T02:21:33.0316807Z OK 2022-11-23T02:21:33.0316934Z 2022-11-23T02:21:33.0317062Z Generating XML reports... 2022-11-23T02:21:33.0317785Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022008.xml 2022-11-23T02:21:33.0318594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0319166Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0319766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0320254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0320735Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxjc_c1ch 2022-11-23T02:21:33.0321264Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxjc_c1ch/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0321700Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0321901Z 2022-11-23T02:21:33.0322008Z Running tests... 2022-11-23T02:21:33.0322399Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0322953Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0323584Z test_cuda (__main__.DistributedDataParallelSingleProcessTest) ... INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:21:33.0324064Z ok (0.457s) 2022-11-23T02:21:33.0324211Z 2022-11-23T02:21:33.0324458Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0324788Z Ran 1 test in 0.457s 2022-11-23T02:21:33.0324954Z 2022-11-23T02:21:33.0325049Z OK 2022-11-23T02:21:33.0325184Z 2022-11-23T02:21:33.0325290Z Generating XML reports... 2022-11-23T02:21:33.0325985Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022012.xml 2022-11-23T02:21:33.0326772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0327231Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0327798Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0328280Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0328750Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4jh5p9hh 2022-11-23T02:21:33.0329305Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4jh5p9hh/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0329721Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0329919Z 2022-11-23T02:21:33.0330027Z Running tests... 2022-11-23T02:21:33.0330434Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0330959Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0331589Z test_rnn (__main__.DistributedDataParallelSingleProcessTest) ... INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:21:33.0332061Z ok (1.292s) 2022-11-23T02:21:33.0332209Z 2022-11-23T02:21:33.0332474Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0332787Z Ran 1 test in 1.292s 2022-11-23T02:21:33.0332950Z 2022-11-23T02:21:33.0333042Z OK 2022-11-23T02:21:33.0333175Z 2022-11-23T02:21:33.0333300Z Generating XML reports... 2022-11-23T02:21:33.0334032Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022017.xml 2022-11-23T02:21:33.0334837Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0335297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0335881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0336347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0336819Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqi41yu1x 2022-11-23T02:21:33.0337424Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqi41yu1x/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0337838Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0338039Z 2022-11-23T02:21:33.0338148Z Running tests... 2022-11-23T02:21:33.0338556Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0339105Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0339669Z test_all_gather (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48491 2022-11-23T02:21:33.0340223Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48492 2022-11-23T02:21:33.0340841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0341282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0341866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0342346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0342939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0343377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0343960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0344431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0344899Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyhf2cq_a 2022-11-23T02:21:33.0345439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyhf2cq_a/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0345984Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg3w_f9j9 2022-11-23T02:21:33.0346525Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg3w_f9j9/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0346936Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0347353Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:21:33.0347751Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0348155Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-11-23T02:21:33.0348633Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0349502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0350192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0350881Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0351284Z ok (5.277s) 2022-11-23T02:21:33.0351432Z 2022-11-23T02:21:33.0351700Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0352118Z Ran 1 test in 5.277s 2022-11-23T02:21:33.0352273Z 2022-11-23T02:21:33.0352369Z OK 2022-11-23T02:21:33.0352504Z 2022-11-23T02:21:33.0352629Z Generating XML reports... 2022-11-23T02:21:33.0353286Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022022.xml 2022-11-23T02:21:33.0354014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0354471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0355055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0355612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0356069Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplxd9l8_u 2022-11-23T02:21:33.0356625Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplxd9l8_u/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0357065Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0357268Z 2022-11-23T02:21:33.0357358Z Running 
tests... 2022-11-23T02:21:33.0357766Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0358312Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0358894Z test_all_to_all (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48607 2022-11-23T02:21:33.0359426Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48608 2022-11-23T02:21:33.0360045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0360504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0361091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0361553Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0362143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0362595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0363156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0363633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0364103Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpng2ckc1d 2022-11-23T02:21:33.0364659Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpng2ckc1d/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0365183Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp82iiueb8 2022-11-23T02:21:33.0365727Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp82iiueb8/_remote_module_non_scriptable.py 
2022-11-23T02:21:33.0366159Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0366555Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:21:33.0366947Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0367357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:21:33.0367853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0368340Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0369024Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0369852Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0370246Z ok (5.383s) 2022-11-23T02:21:33.0370395Z 2022-11-23T02:21:33.0370669Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0370996Z Ran 1 test in 5.383s 2022-11-23T02:21:33.0371160Z 2022-11-23T02:21:33.0371252Z OK 2022-11-23T02:21:33.0371369Z 2022-11-23T02:21:33.0371493Z Generating XML reports... 
2022-11-23T02:21:33.0372140Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022031.xml 2022-11-23T02:21:33.0372888Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0373387Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0373971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0374453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0374929Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjhdwau8y 2022-11-23T02:21:33.0375461Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjhdwau8y/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0375900Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0376100Z 2022-11-23T02:21:33.0376310Z Running tests... 2022-11-23T02:21:33.0376718Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0377262Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0377839Z test_all_to_all_single (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48723 2022-11-23T02:21:33.0378401Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48724 2022-11-23T02:21:33.0379023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0379482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0380046Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0380525Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0381160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0381610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0382169Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0382641Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0383120Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgzx25d_j 2022-11-23T02:21:33.0383652Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgzx25d_j/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0384193Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpki5pba5u 2022-11-23T02:21:33.0384741Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpki5pba5u/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0385178Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0385574Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:21:33.0385968Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0386378Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-11-23T02:21:33.0386864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0387369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0388096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0388808Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0389644Z ok (5.280s) 2022-11-23T02:21:33.0389797Z 2022-11-23T02:21:33.0390075Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0390406Z Ran 1 test in 5.281s 2022-11-23T02:21:33.0390570Z 2022-11-23T02:21:33.0390645Z OK 2022-11-23T02:21:33.0390784Z 2022-11-23T02:21:33.0390907Z Generating XML reports... 2022-11-23T02:21:33.0391652Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022040.xml 2022-11-23T02:21:33.0392406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0392850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0393438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0393913Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0394387Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7u2_81qe 2022-11-23T02:21:33.0394914Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7u2_81qe/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0395348Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0395547Z 2022-11-23T02:21:33.0395656Z Running 
tests... 2022-11-23T02:21:33.0396050Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0396594Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0397187Z test_allreduce (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48839 2022-11-23T02:21:33.0397739Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48840 2022-11-23T02:21:33.0398345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0398799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0399384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0399846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0400430Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0400883Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0401467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0401923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0402397Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzvnchiz_ 2022-11-23T02:21:33.0402947Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzvnchiz_/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0403467Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbu8ryuxm 2022-11-23T02:21:33.0404011Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbu8ryuxm/_remote_module_non_scriptable.py 
2022-11-23T02:21:33.0404444Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0404772Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0405163Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:21:33.0405644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:21:33.0406211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0406709Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0407383Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0408083Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0408484Z ok (5.280s) 2022-11-23T02:21:33.0408633Z 2022-11-23T02:21:33.0408885Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0409266Z Ran 1 test in 5.281s 2022-11-23T02:21:33.0409430Z 2022-11-23T02:21:33.0409523Z OK 2022-11-23T02:21:33.0409657Z 2022-11-23T02:21:33.0409762Z Generating XML reports... 
2022-11-23T02:21:33.0410417Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022050.xml 2022-11-23T02:21:33.0411168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0411628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0412193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0412670Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0413139Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn_ey2hdy 2022-11-23T02:21:33.0413687Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn_ey2hdy/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0414104Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0414302Z 2022-11-23T02:21:33.0414411Z Running tests... 2022-11-23T02:21:33.0414819Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0415348Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0415932Z test_broadcast (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 48955 2022-11-23T02:21:33.0416480Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 48956 2022-11-23T02:21:33.0417098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0417541Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0418122Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0418603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0419176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0419628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0420204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0420677Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0421131Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp5wwwln7 2022-11-23T02:21:33.0421683Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp5wwwln7/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0422221Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc2v2kgqd 2022-11-23T02:21:33.0422770Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc2v2kgqd/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0423183Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0423664Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:21:33.0424071Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0424463Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 1 2022-11-23T02:21:33.0424956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0425455Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0426128Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0426809Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0427272Z ok (5.278s) 2022-11-23T02:21:33.0427421Z 2022-11-23T02:21:33.0427691Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0428002Z Ran 1 test in 5.278s 2022-11-23T02:21:33.0428168Z 2022-11-23T02:21:33.0428261Z OK 2022-11-23T02:21:33.0428395Z 2022-11-23T02:21:33.0428519Z Generating XML reports... 2022-11-23T02:21:33.0429492Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022059.xml 2022-11-23T02:21:33.0430240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0430698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0431280Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0431745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0432220Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_welmka1 2022-11-23T02:21:33.0432768Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_welmka1/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0433204Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0433388Z 2022-11-23T02:21:33.0433496Z Running 
tests... 2022-11-23T02:21:33.0433902Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0434450Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0435011Z test_gather (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49071 2022-11-23T02:21:33.0435556Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49072 2022-11-23T02:21:33.0436176Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0436635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0437204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0437677Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0438261Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0438714Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0439274Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0439743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0440213Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp474wxnqg 2022-11-23T02:21:33.0440744Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp474wxnqg/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0441290Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk8yph4_o 2022-11-23T02:21:33.0441912Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk8yph4_o/_remote_module_non_scriptable.py 
2022-11-23T02:21:33.0442359Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0442753Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:21:33.0443149Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0443562Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:21:33.0444044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0444549Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0445291Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0445996Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0446383Z ok (5.382s) 2022-11-23T02:21:33.0446532Z 2022-11-23T02:21:33.0446800Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0447131Z Ran 1 test in 5.382s 2022-11-23T02:21:33.0447294Z 2022-11-23T02:21:33.0447368Z OK 2022-11-23T02:21:33.0447502Z 2022-11-23T02:21:33.0447625Z Generating XML reports... 
2022-11-23T02:21:33.0448272Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022108.xml 2022-11-23T02:21:33.0449021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0449467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0450049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0450523Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0450982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq8baoykj 2022-11-23T02:21:33.0451531Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq8baoykj/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0451966Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0452164Z 2022-11-23T02:21:33.0452272Z Running tests... 2022-11-23T02:21:33.0452663Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0453207Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0453790Z test_reduce (__main__.TestDistributedNNFunctionsGloo) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49187 2022-11-23T02:21:33.0454323Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49188 2022-11-23T02:21:33.0454944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0455401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0455986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0456447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0457031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0457487Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0458067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0458526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0458998Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplqoaczva 2022-11-23T02:21:33.0459601Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplqoaczva/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0460131Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbj7atyul 2022-11-23T02:21:33.0460674Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbj7atyul/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0461106Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0461521Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:21:33.0461900Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0462311Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 0 2022-11-23T02:21:33.0462862Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0463358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0464038Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0464744Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0465146Z ok (5.280s) 2022-11-23T02:21:33.0465276Z 2022-11-23T02:21:33.0465546Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0465877Z Ran 1 test in 5.281s 2022-11-23T02:21:33.0466037Z 2022-11-23T02:21:33.0466130Z OK 2022-11-23T02:21:33.0466264Z 2022-11-23T02:21:33.0466369Z Generating XML reports... 2022-11-23T02:21:33.0467014Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022117.xml 2022-11-23T02:21:33.0467767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0468229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0468801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0469672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0470150Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjv24x03w 2022-11-23T02:21:33.0470683Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjv24x03w/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0471117Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0471315Z 2022-11-23T02:21:33.0471423Z Running 
tests... 2022-11-23T02:21:33.0471833Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0472366Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_gloo 2022-11-23T02:21:33.0472946Z test_scatter (__main__.TestDistributedNNFunctionsGloo) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49303 2022-11-23T02:21:33.0473508Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49304 2022-11-23T02:21:33.0474121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0474561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0475146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0475624Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0476193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:21:33.0476652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:21:33.0477568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:21:33.0478161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:21:33.0478631Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcay3gohu 2022-11-23T02:21:33.0479179Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcay3gohu/_remote_module_non_scriptable.py 2022-11-23T02:21:33.0479719Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwzi609e7 2022-11-23T02:21:33.0480243Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwzi609e7/_remote_module_non_scriptable.py 
2022-11-23T02:21:33.0480674Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0481134Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:21:33.0481612Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:21:33.0482004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:21:33.0482508Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:21:33.0483012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:21:33.0483675Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0484379Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:21:33.0484782Z ok (5.378s) 2022-11-23T02:21:33.0484933Z 2022-11-23T02:21:33.0485199Z ---------------------------------------------------------------------- 2022-11-23T02:21:33.0485511Z Ran 1 test in 5.379s 2022-11-23T02:21:33.0485678Z 2022-11-23T02:21:33.0485771Z OK 2022-11-23T02:21:33.0485905Z 2022-11-23T02:21:33.0486029Z Generating XML reports... 2022-11-23T02:21:33.0486659Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022126.xml 2022-11-23T02:21:33.0487059Z 2022-11-23T02:21:33.0487545Z ##[endgroup] 2022-11-23T02:21:33.0488143Z FINISHED PRINTING LOG FILE of distributed/test_c10d_spawn_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_gloo_11n4unzf) 2022-11-23T02:21:33.0488491Z 2022-11-23T02:21:33.0488757Z Running distributed/test_c10d_spawn_nccl ... 
[2022-11-23 02:21:33.029435] 2022-11-23T02:21:33.0489454Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_spawn_nccl.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:21:33.029759] 2022-11-23T02:23:00.6763815Z 2022-11-23T02:23:00.6764360Z Expand the folded group to see the log file of distributed/test_c10d_spawn_nccl 2022-11-23T02:23:00.6766760Z ##[group]PRINTING LOG FILE of distributed/test_c10d_spawn_nccl (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_nccl_c6uqgh3v) 2022-11-23T02:23:00.6767607Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyeni0cbb 2022-11-23T02:23:00.6768191Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyeni0cbb/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6768648Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6768963Z 2022-11-23T02:23:00.6769287Z 2022-11-23T02:23:00.6770859Z , <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_gather_base>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_all_to_all_single>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_allreduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_broadcast>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter>, <__main__.TestDistributedNNFunctionsNccl testMethod=test_reduce_scatter_non_contiguous>]> 2022-11-23T02:23:00.6772350Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6773165Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6773793Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6774499Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6775419Z test_allreduce 
(__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6775985Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6776365Z test_reduce (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6776772Z test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6780816Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) 2022-11-23T02:23:00.6781597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6782114Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6782722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6783213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6783674Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_50bw19o 2022-11-23T02:23:00.6784229Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_50bw19o/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6784665Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6784870Z 2022-11-23T02:23:00.6784964Z Running tests... 2022-11-23T02:23:00.6785393Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6785947Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6786544Z test_all_gather (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49453 2022-11-23T02:23:00.6787088Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49454 2022-11-23T02:23:00.6787714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6788178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6788768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6789651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6790263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6790724Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6791300Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6791785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6792269Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp12dppe2 2022-11-23T02:23:00.6792836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp12dppe2/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6793361Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7k4a_m8w 2022-11-23T02:23:00.6793909Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7k4a_m8w/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6794339Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6794657Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6795072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6795559Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 0 2022-11-23T02:23:00.6796225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6796743Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6797423Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6798130Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6798536Z ok (5.281s) 2022-11-23T02:23:00.6798670Z 2022-11-23T02:23:00.6798943Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6799367Z Ran 1 test in 5.281s 2022-11-23T02:23:00.6799537Z 2022-11-23T02:23:00.6799778Z OK 2022-11-23T02:23:00.6799918Z 2022-11-23T02:23:00.6800027Z Generating XML reports... 2022-11-23T02:23:00.6800695Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022140.xml 2022-11-23T02:23:00.6801453Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6801918Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6802488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6802967Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6803442Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps1omqc0w 2022-11-23T02:23:00.6803985Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps1omqc0w/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6804421Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6804620Z 2022-11-23T02:23:00.6804731Z Running 
tests... 2022-11-23T02:23:00.6805146Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6805678Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6806276Z test_all_gather_base (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49572 2022-11-23T02:23:00.6806839Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49573 2022-11-23T02:23:00.6807462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6807900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6808484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6808965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6809536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6809995Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6810572Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6811045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6811498Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkxumoy_3 2022-11-23T02:23:00.6812053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkxumoy_3/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6812596Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphnenpqqg 2022-11-23T02:23:00.6813125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphnenpqqg/_remote_module_non_scriptable.py 
2022-11-23T02:23:00.6813564Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6814036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6814543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6814938Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6815347Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6815842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6816498Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6817205Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6818212Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:23:00.6818845Z warnings.warn( 2022-11-23T02:23:00.6819612Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:23:00.6820143Z warnings.warn( 2022-11-23T02:23:00.6820919Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 
2022-11-23T02:23:00.6821484Z warnings.warn( 2022-11-23T02:23:00.6822252Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2923: UserWarning: torch.distributed._reduce_scatter_base is a private function and will be deprecated. Please use torch.distributed.reduce_scatter_tensor instead. 2022-11-23T02:23:00.6822780Z warnings.warn( 2022-11-23T02:23:00.6823023Z ok (5.481s) 2022-11-23T02:23:00.6823176Z 2022-11-23T02:23:00.6823446Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6823780Z Ran 1 test in 5.482s 2022-11-23T02:23:00.6823925Z 2022-11-23T02:23:00.6824019Z OK 2022-11-23T02:23:00.6824155Z 2022-11-23T02:23:00.6824280Z Generating XML reports... 2022-11-23T02:23:00.6824934Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022149.xml 2022-11-23T02:23:00.6825668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6826138Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6826725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6827210Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6827668Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn_uco_7w 2022-11-23T02:23:00.6828216Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn_uco_7w/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6828651Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6828855Z 2022-11-23T02:23:00.6829313Z Running tests... 
2022-11-23T02:23:00.6829748Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6830297Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6830887Z test_all_to_all (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49691 2022-11-23T02:23:00.6831430Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49692 2022-11-23T02:23:00.6832143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6832621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6833194Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6833680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6834273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6834726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6835292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6835845Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6836320Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpok2uivfk 2022-11-23T02:23:00.6836878Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpok2uivfk/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6837403Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn9_94km9 2022-11-23T02:23:00.6837944Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn9_94km9/_remote_module_non_scriptable.py 
2022-11-23T02:23:00.6838381Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6838779Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6839281Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6839694Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6840095Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6840590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6841265Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6841968Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6842352Z ok (5.481s) 2022-11-23T02:23:00.6842504Z 2022-11-23T02:23:00.6842778Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6843109Z Ran 1 test in 5.481s 2022-11-23T02:23:00.6843275Z 2022-11-23T02:23:00.6843367Z OK 2022-11-23T02:23:00.6843485Z 2022-11-23T02:23:00.6859902Z Generating XML reports... 
2022-11-23T02:23:00.6860681Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022158.xml 2022-11-23T02:23:00.6861446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6861917Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6862521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6863005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6863467Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpexw9vks2 2022-11-23T02:23:00.6864020Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpexw9vks2/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6864461Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6864663Z 2022-11-23T02:23:00.6864755Z Running tests... 2022-11-23T02:23:00.6865171Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6865722Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6866427Z test_all_to_all_single (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49812 2022-11-23T02:23:00.6866984Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49813 2022-11-23T02:23:00.6867609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6868069Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6868654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6869517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6870110Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6870677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6871245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6871727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6872204Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdlkoncr1 2022-11-23T02:23:00.6872758Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdlkoncr1/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6873287Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwxxqkrjy 2022-11-23T02:23:00.6873833Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwxxqkrjy/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6874269Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6874581Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6874997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6875501Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6876002Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6876479Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6877156Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6877858Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6878239Z ok (5.483s) 2022-11-23T02:23:00.6878391Z 2022-11-23T02:23:00.6878660Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6878994Z Ran 1 test in 5.484s 2022-11-23T02:23:00.6879160Z 2022-11-23T02:23:00.6879254Z OK 2022-11-23T02:23:00.6879372Z 2022-11-23T02:23:00.6879498Z Generating XML reports... 2022-11-23T02:23:00.6880196Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022208.xml 2022-11-23T02:23:00.6880952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6881415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6881981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6882459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6882934Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmo7lxwmv 2022-11-23T02:23:00.6883470Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmo7lxwmv/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6883908Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6884110Z 2022-11-23T02:23:00.6884219Z 
Running tests... 2022-11-23T02:23:00.6884614Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6885238Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6885832Z test_allreduce (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 49933 2022-11-23T02:23:00.6886389Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 49934 2022-11-23T02:23:00.6886996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6887449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6888034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6888575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6889148Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6889601Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6890180Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6890637Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6891105Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy0b_dm0u 2022-11-23T02:23:00.6891651Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy0b_dm0u/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6892190Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqyuyke9e 2022-11-23T02:23:00.6892723Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqyuyke9e/_remote_module_non_scriptable.py 
2022-11-23T02:23:00.6893155Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6893564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6894045Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6894457Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6894867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6895355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6896013Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6896709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6897114Z ok (5.483s) 2022-11-23T02:23:00.6897265Z 2022-11-23T02:23:00.6897528Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6897837Z Ran 1 test in 5.483s 2022-11-23T02:23:00.6898001Z 2022-11-23T02:23:00.6898095Z OK 2022-11-23T02:23:00.6898235Z 2022-11-23T02:23:00.6898360Z Generating XML reports... 
2022-11-23T02:23:00.6898998Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022217.xml 2022-11-23T02:23:00.6899753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6900214Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6900792Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6901253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6901727Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpha6nfi5b 2022-11-23T02:23:00.6902281Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpha6nfi5b/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6902841Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6903050Z 2022-11-23T02:23:00.6903161Z Running tests... 2022-11-23T02:23:00.6903576Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6904117Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6904683Z test_broadcast (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50052 2022-11-23T02:23:00.6905229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50053 2022-11-23T02:23:00.6905847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6906371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6906952Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6907434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6908019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6908447Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6909439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6909920Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6910370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8spg71kp 2022-11-23T02:23:00.6910921Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8spg71kp/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6911469Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps50m_kga 2022-11-23T02:23:00.6912019Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps50m_kga/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6912432Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6912847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6913346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:23:00.6913762Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6914151Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6914647Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6915329Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6916016Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6916417Z ok (5.481s) 2022-11-23T02:23:00.6916572Z 2022-11-23T02:23:00.6916842Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6917173Z Ran 1 test in 5.481s 2022-11-23T02:23:00.6917319Z 2022-11-23T02:23:00.6917411Z OK 2022-11-23T02:23:00.6917547Z 2022-11-23T02:23:00.6917673Z Generating XML reports... 2022-11-23T02:23:00.6918322Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022226.xml 2022-11-23T02:23:00.6919060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6919517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6920107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6920589Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6921132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpna0adqtr 2022-11-23T02:23:00.6921696Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpna0adqtr/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6922136Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6922337Z 
2022-11-23T02:23:00.6922429Z Running tests... 2022-11-23T02:23:00.6922840Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6923381Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6923960Z test_reduce (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50171 2022-11-23T02:23:00.6924572Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50172 2022-11-23T02:23:00.6925193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6925657Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6926225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6926701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6927291Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6927746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6928308Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6928787Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6929263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr9cwbkoc 2022-11-23T02:23:00.6929819Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr9cwbkoc/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6930341Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_jkdmarg 2022-11-23T02:23:00.6930881Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp_jkdmarg/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6931316Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6931710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6932208Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6932619Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6933030Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6933510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6934188Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6934888Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6935271Z ok (5.277s) 2022-11-23T02:23:00.6935424Z 2022-11-23T02:23:00.6935693Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6936030Z Ran 1 test in 5.277s 2022-11-23T02:23:00.6936193Z 2022-11-23T02:23:00.6936288Z OK 2022-11-23T02:23:00.6936406Z 2022-11-23T02:23:00.6936531Z Generating XML reports... 
2022-11-23T02:23:00.6937175Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022236.xml 2022-11-23T02:23:00.6937928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6938369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6939015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6939503Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6939976Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyd9yv72e 2022-11-23T02:23:00.6940508Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyd9yv72e/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6940944Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6941147Z 2022-11-23T02:23:00.6941257Z Running tests... 2022-11-23T02:23:00.6941647Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6942247Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6942840Z test_reduce_scatter (__main__.TestDistributedNNFunctionsNccl) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50290 2022-11-23T02:23:00.6943404Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50291 2022-11-23T02:23:00.6944006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6944459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6945040Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6945521Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6946095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6946554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6947138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6947602Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6948070Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5mo0rbrb 2022-11-23T02:23:00.6948619Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5mo0rbrb/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6949502Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaycrtc92 2022-11-23T02:23:00.6950033Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaycrtc92/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6950470Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6950885Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6951373Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T02:23:00.6951783Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6952195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6952692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6953358Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6954054Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6954456Z ok (5.381s) 2022-11-23T02:23:00.6954606Z 2022-11-23T02:23:00.6954858Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6955185Z Ran 1 test in 5.382s 2022-11-23T02:23:00.6955352Z 2022-11-23T02:23:00.6955449Z OK 2022-11-23T02:23:00.6955585Z 2022-11-23T02:23:00.6955710Z Generating XML reports... 2022-11-23T02:23:00.6956343Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022245.xml 2022-11-23T02:23:00.6957171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6957641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6958213Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6958691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6959164Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4a3lr7jl 2022-11-23T02:23:00.6959709Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4a3lr7jl/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6960204Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6960403Z 
2022-11-23T02:23:00.6960514Z Running tests... 2022-11-23T02:23:00.6960925Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6961460Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_spawn_nccl 2022-11-23T02:23:00.6962068Z test_reduce_scatter_non_contiguous (__main__.TestDistributedNNFunctionsNccl) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50409 2022-11-23T02:23:00.6962647Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50410 2022-11-23T02:23:00.6963266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6963708Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6964288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6964770Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6965353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:00.6965790Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:00.6966365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:00.6966836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:00.6967333Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzysemuut 2022-11-23T02:23:00.6967868Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzysemuut/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6968413Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuk2zzm98 2022-11-23T02:23:00.6968962Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmpuk2zzm98/_remote_module_non_scriptable.py 2022-11-23T02:23:00.6969382Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6969797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:00.6970297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:23:00.6970708Z INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:00.6971100Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:00.6971594Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:23:00.6972269Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6972950Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:23:00.6973349Z ok (5.380s) 2022-11-23T02:23:00.6973498Z 2022-11-23T02:23:00.6973768Z ---------------------------------------------------------------------- 2022-11-23T02:23:00.6974099Z Ran 1 test in 5.381s 2022-11-23T02:23:00.6974245Z 2022-11-23T02:23:00.6974338Z OK 2022-11-23T02:23:00.6974473Z 2022-11-23T02:23:00.6974656Z Generating XML reports... 2022-11-23T02:23:00.6975314Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022254.xml 2022-11-23T02:23:00.6975709Z 2022-11-23T02:23:00.6976080Z ##[endgroup] 2022-11-23T02:23:00.6976679Z FINISHED PRINTING LOG FILE of distributed/test_c10d_spawn_nccl (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_spawn_nccl_c6uqgh3v) 2022-11-23T02:23:00.6977030Z 2022-11-23T02:23:00.6977304Z Running distributed/_tensor/test_device_mesh ... 
[2022-11-23 02:23:00.676647] 2022-11-23T02:23:00.6977993Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/_tensor/test_device_mesh.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:23:00.676923] 2022-11-23T02:23:54.4192807Z 2022-11-23T02:23:54.4193947Z Expand the folded group to see the log file of distributed/_tensor/test_device_mesh 2022-11-23T02:23:54.4194941Z ##[group]PRINTING LOG FILE of distributed/_tensor/test_device_mesh (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_device_mesh_qcol5efi) 2022-11-23T02:23:54.4195322Z 2022-11-23T02:23:54.4195439Z Running tests... 2022-11-23T02:23:54.4195971Z ---------------------------------------------------------------------- 2022-11-23T02:23:54.4197426Z Test results will be stored in test-reports/python-unittest/distributed._tensor.test_device_mesh 2022-11-23T02:23:54.4197936Z test_all_gather_1d (__main__.DeviceMeshCollectiveTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:23:54.4198438Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50528 2022-11-23T02:23:54.4198902Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50529 2022-11-23T02:23:54.4199611Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50530 2022-11-23T02:23:54.4200050Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50531 2022-11-23T02:23:54.4200540Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 50532 2022-11-23T02:23:54.4201407Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 50533 2022-11-23T02:23:54.4202240Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 50534 2022-11-23T02:23:54.4203193Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 50535 2022-11-23T02:23:54.4204379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: 
loaded 51 slow tests 2022-11-23T02:23:54.4205128Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4205708Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4206215Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4206803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4207298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4207907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4208392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4208974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4209437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4209999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4210487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4211074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4211787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4212385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4212867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4213455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4213900Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4214487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4215156Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4216182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4216756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4217915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4218802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4220191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4220991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4221999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4222855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4223946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4224741Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4225738Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4226603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4227380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4228358Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 
1 2022-11-23T02:23:54.4229595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4230090Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4230558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4231026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4231506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4231975Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4232356Z skip: Need at least 8 CUDA devices (4.225s) 2022-11-23T02:23:54.4232846Z test_all_gather_nd (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 50800 2022-11-23T02:23:54.4233392Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 50801 2022-11-23T02:23:54.4233830Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 50802 2022-11-23T02:23:54.4234278Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 50803 2022-11-23T02:23:54.4234731Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 50804 2022-11-23T02:23:54.4235178Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 50805 2022-11-23T02:23:54.4235741Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 50806 2022-11-23T02:23:54.4236208Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 50807 2022-11-23T02:23:54.4236841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4237282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4237868Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4238344Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4238926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4239449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4240032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4240510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4241080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4241533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4242108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4242580Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4243144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4243603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4244181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4244653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4245218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4245667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4246243Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4246695Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4247277Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4247727Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4248306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4248755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4249343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4249792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4250351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4250818Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4251399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4251847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4252410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4252881Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4253387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4253882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4254340Z INFO:torch.testing._internal.common_distributed:Starting 
event listener thread for rank 7 2022-11-23T02:23:54.4254810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4255283Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4255738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4256203Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4256755Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4257157Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:23:54.4257641Z test_all_gather_uneven (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51072 2022-11-23T02:23:54.4258187Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51073 2022-11-23T02:23:54.4258643Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51074 2022-11-23T02:23:54.4259073Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51075 2022-11-23T02:23:54.4259522Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 51076 2022-11-23T02:23:54.4259964Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 51077 2022-11-23T02:23:54.4260412Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 51078 2022-11-23T02:23:54.4260838Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 51079 2022-11-23T02:23:54.4261464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4261921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4262492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: 
UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4262968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4263550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4264001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4264566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4265041Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4265621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4266073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4266636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4267104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4267683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4268112Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4268688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4269662Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4270259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4270785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4271384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:23:54.4271852Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4272417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4272862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4273434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4273902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4274554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4275010Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4275585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4276055Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4276610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4277061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4277637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4278085Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4278533Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4279064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4279550Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4280009Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 2 2022-11-23T02:23:54.4280473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4280947Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4281398Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4281868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4282261Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4282756Z test_all_reduce_1d (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51344 2022-11-23T02:23:54.4283279Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51345 2022-11-23T02:23:54.4283738Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51346 2022-11-23T02:23:54.4284190Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51347 2022-11-23T02:23:54.4284618Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 51348 2022-11-23T02:23:54.4285065Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 51349 2022-11-23T02:23:54.4285506Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 51350 2022-11-23T02:23:54.4285950Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 51351 2022-11-23T02:23:54.4286555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4287017Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4287598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4288141Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:23:54.4288722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4289173Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4289753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4290204Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4290788Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4291294Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4291872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4292329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4292912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4293363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4293922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4294386Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4294972Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4295422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4295980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4296452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4297037Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4297483Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4298043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4298511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4299089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4299516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4300096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4300563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4301145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4301574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4302151Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4302617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4303042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4303523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4303999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4304474Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4304989Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 1 2022-11-23T02:23:54.4305471Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4305940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4306411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4306786Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:23:54.4307281Z test_all_reduce_nd (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51616 2022-11-23T02:23:54.4307822Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51617 2022-11-23T02:23:54.4308319Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51618 2022-11-23T02:23:54.4308764Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51619 2022-11-23T02:23:54.4309779Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 51620 2022-11-23T02:23:54.4310232Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 51621 2022-11-23T02:23:54.4310660Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 51622 2022-11-23T02:23:54.4311099Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 51623 2022-11-23T02:23:54.4311723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4312164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4312747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4313224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4313812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4314248Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4314828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4315302Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4315872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4316320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4316898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4317372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4317938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4318395Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4318971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4319440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4320004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4320454Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4321028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4321480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4322070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T02:23:54.4322521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4323193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4323664Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4324247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4324695Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4325255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4325722Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4326408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4326857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4327421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4327893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4328337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4328813Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4329273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4329739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4330207Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4330665Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 0 2022-11-23T02:23:54.4331133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4331602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4332001Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4332470Z test_all_to_all_1d (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 51888 2022-11-23T02:23:54.4333010Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 51889 2022-11-23T02:23:54.4333462Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 51890 2022-11-23T02:23:54.4333895Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 51891 2022-11-23T02:23:54.4334349Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 51892 2022-11-23T02:23:54.4334792Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 51893 2022-11-23T02:23:54.4335240Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 51894 2022-11-23T02:23:54.4335668Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 51895 2022-11-23T02:23:54.4336285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4336745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4337312Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4337789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4338367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4338817Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4339381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4339914Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4340508Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4340953Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4341510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4341981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4342557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4343048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4343625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4344093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4344673Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4345099Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4345675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4346145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4346709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4347155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4347735Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4348205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4348775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4349712Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4350299Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4350771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4351332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4351782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4352364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4352814Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4353261Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4353745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4354222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4354679Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4355144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4355615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4356069Z INFO:torch.testing._internal.common_distributed:Starting event 
listener thread for rank 5 2022-11-23T02:23:54.4356543Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4356937Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4357514Z test_all_to_all_nd (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52160 2022-11-23T02:23:54.4358054Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52161 2022-11-23T02:23:54.4358508Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52162 2022-11-23T02:23:54.4358959Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52163 2022-11-23T02:23:54.4359393Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52164 2022-11-23T02:23:54.4359841Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52165 2022-11-23T02:23:54.4360285Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52166 2022-11-23T02:23:54.4360803Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 52167 2022-11-23T02:23:54.4361410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4361869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4362449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4362924Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4363492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4363942Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4364516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: 
UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4364979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4365562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4366015Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4366595Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4367045Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4367628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4368076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4368632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4369103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4369693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4370139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4370703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4371173Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4371756Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4372201Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4372755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:23:54.4373225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4373810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4374238Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4374891Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4375373Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4375960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4376392Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4376966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4377436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4377922Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4378403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4378920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4379400Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4379859Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4380322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4380793Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4381242Z INFO:torch.testing._internal.common_distributed:Starting event listener 
thread for rank 3 2022-11-23T02:23:54.4381636Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4382129Z test_broadcast_1d (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52432 2022-11-23T02:23:54.4382669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52433 2022-11-23T02:23:54.4383108Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52434 2022-11-23T02:23:54.4383551Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52435 2022-11-23T02:23:54.4383999Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52436 2022-11-23T02:23:54.4384444Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52437 2022-11-23T02:23:54.4384872Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52438 2022-11-23T02:23:54.4385312Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 52439 2022-11-23T02:23:54.4385934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4386380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4386964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4387449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4388035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4388467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4389560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4390059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:23:54.4390632Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4391086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4391663Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4392133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4392796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4393261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4393845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4394318Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4394883Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4395334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4395990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4396444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4397030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4397478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4398054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4398505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4399090Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4399538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4400103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4400572Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4401154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4401598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4402161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4402632Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4403073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4403537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4404014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4404488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4404960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4405417Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4405882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4406349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4406744Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:23:54.4407215Z 
test_broadcast_nd (__main__.DeviceMeshCollectiveTest) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52704 2022-11-23T02:23:54.4407757Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52705 2022-11-23T02:23:54.4408216Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52706 2022-11-23T02:23:54.4408651Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52707 2022-11-23T02:23:54.4409094Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52708 2022-11-23T02:23:54.4409602Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52709 2022-11-23T02:23:54.4410058Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52710 2022-11-23T02:23:54.4410481Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 52711 2022-11-23T02:23:54.4411095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4411551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4412106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4412618Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4413198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4413678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4414253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4414723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4415304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: 
UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4415731Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4416318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4416785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4417370Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4417800Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4418381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4418851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4419427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4419857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4420432Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4420898Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4421465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4421910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4422495Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4422962Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4423527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 
2022-11-23T02:23:54.4423984Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4424557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4425007Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4425590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4426040Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4426671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4427130Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4427578Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4428062Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4428540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4429431Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4429935Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4430507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4430961Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4431432Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4431825Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:23:54.4432320Z test_reduce_scatter_1d (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 52976 2022-11-23T02:23:54.4432855Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 52977 2022-11-23T02:23:54.4433308Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 52978 2022-11-23T02:23:54.4433765Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 52979 2022-11-23T02:23:54.4434199Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 52980 2022-11-23T02:23:54.4434654Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 52981 2022-11-23T02:23:54.4435099Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 52982 2022-11-23T02:23:54.4435548Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 52983 2022-11-23T02:23:54.4436366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4436826Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4437407Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4437886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4438455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4438908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4439490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4439952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4440539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4440990Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4441565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4442019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4442596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4443039Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4443604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4444078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4444737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4445195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4445760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4446228Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4446804Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4447247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4447893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4448360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4448938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4449363Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4449940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4450408Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4450989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4451418Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4451992Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4452461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4452884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4453365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4453833Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4454314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4454771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4455233Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4455707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4456178Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4456557Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:23:54.4457060Z test_reduce_scatter_nd (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53248 2022-11-23T02:23:54.4457611Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53249 2022-11-23T02:23:54.4458045Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53250 2022-11-23T02:23:54.4458490Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53251 2022-11-23T02:23:54.4458938Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 53252 2022-11-23T02:23:54.4459383Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 53253 2022-11-23T02:23:54.4459810Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 53254 2022-11-23T02:23:54.4460254Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 53255 2022-11-23T02:23:54.4460870Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4461369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4461968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4462447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4463037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4463473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4464052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4464587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4465154Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4465603Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4466184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4466654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4467219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4467670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4468247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4468714Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4469881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4470335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4470918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4471375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4471959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4472408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4472984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4473439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4474026Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4474471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4475032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4475502Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4476082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4476529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4477089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4477557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4477999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4478483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4478990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4479559Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4480043Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4480501Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4480966Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4481434Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4481829Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4482314Z test_reduce_scatter_uneven (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53520 2022-11-23T02:23:54.4482963Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53521 2022-11-23T02:23:54.4483422Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53522 2022-11-23T02:23:54.4483855Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53523 2022-11-23T02:23:54.4484302Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 53524 2022-11-23T02:23:54.4484745Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 53525 2022-11-23T02:23:54.4485188Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 53526 2022-11-23T02:23:54.4485615Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 53527 2022-11-23T02:23:54.4486237Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4486698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4487269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4487757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4488342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4488792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4489354Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4489829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4490409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4490856Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4491424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4491894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4492536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4492973Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4493551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4494019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4494596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4495022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4495606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4496072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4496703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4497164Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4497742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4498211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4498774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4499228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4499791Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4500305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4500875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4501352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4501941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4502393Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4502838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4503322Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4503797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4504262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4504728Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4505202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4505654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4506133Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4506524Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4507008Z test_scatter_1d (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 53792 2022-11-23T02:23:54.4507527Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 53793 2022-11-23T02:23:54.4507984Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 53794 2022-11-23T02:23:54.4508439Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 53795 2022-11-23T02:23:54.4508870Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 53796 2022-11-23T02:23:54.4509504Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 53797 2022-11-23T02:23:54.4509952Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 53798 2022-11-23T02:23:54.4510398Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 53799 2022-11-23T02:23:54.4511003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4511459Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4512038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4512528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4513115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4513551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4514219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4514710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4515285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4515736Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4516314Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4516785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4517433Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4517888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4518455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4518911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4519471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4519941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4520534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4520986Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4521575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4522023Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4522603Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4523059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4523644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4524091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4524645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4525118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4525698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4526152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4526712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4527182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4527625Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4528107Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4528566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4529031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4529505Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4529963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4530426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4530981Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4531386Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:23:54.4531855Z test_scatter_nd (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54064 2022-11-23T02:23:54.4532392Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54065 2022-11-23T02:23:54.4532847Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54066 2022-11-23T02:23:54.4533280Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54067 2022-11-23T02:23:54.4533724Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54068 2022-11-23T02:23:54.4534312Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54069 2022-11-23T02:23:54.4534759Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54070 2022-11-23T02:23:54.4535197Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54071 2022-11-23T02:23:54.4535817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4536275Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4536841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4537316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4537906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4538360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4538928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4539403Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4539997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4540445Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4540995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4541449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4542031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4542487Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4543095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4543567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4544158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4544592Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4545174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4545647Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4546212Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4546659Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4547238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4547717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4548341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4548806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4549628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4550100Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4550666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4551116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4551691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4552243Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4552689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4553174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4553654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4554110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4554573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4555046Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4555493Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4555969Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4556362Z skip: Need at least 8 CUDA devices (2.614s) 2022-11-23T02:23:54.4556864Z test_scatter_uneven (__main__.DeviceMeshCollectiveTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54336 2022-11-23T02:23:54.4557394Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54337 2022-11-23T02:23:54.4557847Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54338 2022-11-23T02:23:54.4558300Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54339 2022-11-23T02:23:54.4558733Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54340 2022-11-23T02:23:54.4559180Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54341 2022-11-23T02:23:54.4559620Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54342 2022-11-23T02:23:54.4560069Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54343 2022-11-23T02:23:54.4560675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4561133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4561719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4562182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4562770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4563222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4563799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4564255Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4564849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4565297Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4565955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4566423Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4567016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4567465Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4568018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4568490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4569150Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4569598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4570163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4570639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4571223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4571653Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4572231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4572701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4573282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4573717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4574306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4574775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4575356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4575787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4576367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4576836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4577259Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4577748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4578218Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4578702Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4579210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4579680Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4580157Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4580610Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4581010Z skip: Need at least 8 CUDA devices (2.715s) 2022-11-23T02:23:54.4581475Z test_device_mesh_2d (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54608 2022-11-23T02:23:54.4581999Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54609 2022-11-23T02:23:54.4582434Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54610 2022-11-23T02:23:54.4582938Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54611 2022-11-23T02:23:54.4583404Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54612 2022-11-23T02:23:54.4583834Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54613 2022-11-23T02:23:54.4584283Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54614 2022-11-23T02:23:54.4584727Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54615 2022-11-23T02:23:54.4585349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4585786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4586435Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4586915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4587504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4587937Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4588517Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4589151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4589737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4590187Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4590770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4591242Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4591812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4592266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4592847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4593299Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4593882Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4594334Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4594914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4595372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4595961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4596408Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4596989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4597443Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4598027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4598476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4599039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4599514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4600095Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4600624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4601202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4601674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4602119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4602584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4603061Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4603609Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4604083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4604544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4605006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4605477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4605868Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4606338Z test_device_mesh_2d_from_dim_groups (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 54880 2022-11-23T02:23:54.4606881Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 54881 2022-11-23T02:23:54.4607337Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 54882 2022-11-23T02:23:54.4607777Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 54883 2022-11-23T02:23:54.4608222Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 54884 2022-11-23T02:23:54.4608675Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 54885 2022-11-23T02:23:54.4609123Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 54886 2022-11-23T02:23:54.4609551Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 54887 2022-11-23T02:23:54.4610177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4610635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4611199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4611677Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4612268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4612719Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4613283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4613757Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4614341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4614768Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4615350Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4615820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4616405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4616836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4617465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4617928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4618514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4618972Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4619563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4620032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4620594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4621113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4621690Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4622158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4622721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4623178Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4623757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4624203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4624787Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4625239Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4625816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4626271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4626719Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4627206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4627683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4628141Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4628608Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4629305Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4629777Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4630244Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4630642Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4631135Z test_device_mesh_dim_groups_error (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55152 2022-11-23T02:23:54.4631657Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55153 2022-11-23T02:23:54.4632112Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55154 2022-11-23T02:23:54.4632564Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55155 2022-11-23T02:23:54.4632995Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 55156 2022-11-23T02:23:54.4633451Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 55157 2022-11-23T02:23:54.4633895Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 55158 2022-11-23T02:23:54.4634338Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 55159 2022-11-23T02:23:54.4635023Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4635495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4636081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4636539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4637121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4637574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4638236Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4638691Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4639283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4639732Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4640310Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4640762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4641344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4641794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4642356Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4642827Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4643416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4643865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4644418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4644896Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4645479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4645908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4646488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4646964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4647550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4647982Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4648560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4649029Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4649612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4650042Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4650622Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4651099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4651525Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4652066Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4652556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4653036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4653491Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4653955Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4654425Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4654875Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4655328Z skip: Need at least 8 CUDA devices (2.615s) 2022-11-23T02:23:54.4655794Z test_device_mesh_nd (__main__.DeviceMeshTest) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55424 2022-11-23T02:23:54.4656319Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55425 2022-11-23T02:23:54.4656760Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55426 2022-11-23T02:23:54.4657208Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55427 2022-11-23T02:23:54.4657661Z INFO:torch.testing._internal.common_distributed:Started process 4 with pid 55428 2022-11-23T02:23:54.4658093Z INFO:torch.testing._internal.common_distributed:Started process 5 with pid 55429 2022-11-23T02:23:54.4658544Z INFO:torch.testing._internal.common_distributed:Started process 6 with pid 55430 2022-11-23T02:23:54.4658985Z INFO:torch.testing._internal.common_distributed:Started process 7 with pid 55431 2022-11-23T02:23:54.4659612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4660050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4660636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4661110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4661702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4662136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4662716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4663196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4663767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4664223Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4664805Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4665278Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4665839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4666289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4666872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4667327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4667914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4668372Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4669174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4669658Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4670250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4670698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4671278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4671734Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4672318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4672850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 
2022-11-23T02:23:54.4673418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4673894Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4674478Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:23:54.4674927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:23:54.4675492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:23:54.4675958Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:23:54.4676401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:23:54.4676869Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 6 2022-11-23T02:23:54.4677345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 4 2022-11-23T02:23:54.4677812Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:23:54.4678287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:23:54.4678741Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 7 2022-11-23T02:23:54.4679255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:23:54.4679727Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 5 2022-11-23T02:23:54.4680104Z skip: Need at least 8 CUDA devices (2.616s) 2022-11-23T02:23:54.4680304Z 2022-11-23T02:23:54.4680583Z ---------------------------------------------------------------------- 2022-11-23T02:23:54.4680927Z Ran 19 tests in 51.392s 2022-11-23T02:23:54.4681097Z 2022-11-23T02:23:54.4681207Z 
OK (skipped=19) 2022-11-23T02:23:54.4681347Z 2022-11-23T02:23:54.4681473Z Generating XML reports... 2022-11-23T02:23:54.4682107Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshCollectiveTest-20221123022302.xml 2022-11-23T02:23:54.4682888Z Generated XML report: test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshTest-20221123022302.xml 2022-11-23T02:23:54.4683234Z 2022-11-23T02:23:54.4683615Z ##[endgroup] 2022-11-23T02:23:54.4684233Z FINISHED PRINTING LOG FILE of distributed/_tensor/test_device_mesh (/var/lib/jenkins/workspace/test/test-reports/distributed-_tensor-test_device_mesh_qcol5efi) 2022-11-23T02:23:54.4684595Z 2022-11-23T02:23:54.4684852Z Running distributed/test_pg_wrapper ... [2022-11-23 02:23:54.420100] 2022-11-23T02:23:54.4685556Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_pg_wrapper.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:23:54.420454] 2022-11-23T02:25:34.8870089Z 2022-11-23T02:25:34.8870703Z Expand the folded group to see the log file of distributed/test_pg_wrapper 2022-11-23T02:25:34.8871917Z ##[group]PRINTING LOG FILE of distributed/test_pg_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-test_pg_wrapper_v631wbzi) 2022-11-23T02:25:34.8872680Z 2022-11-23T02:25:34.8875617Z 2022-11-23T02:25:34.8877177Z , <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_cuda>, <__main__.ProcessGroupGlooWrapperTest 
testMethod=test_collectives_op_mismatch_cuda_debug_mode>, <__main__.ProcessGroupGlooWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-11-23T02:25:34.8879038Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8879500Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8879945Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8880427Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8880915Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8881357Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8881809Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8882287Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8882768Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) 2022-11-23T02:25:34.8883805Z , <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collective_shape_mismatch_debug_mode>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch>, <__main__.ProcessGroupNCCLWrapperTest testMethod=test_collectives_op_mismatch_debug_mode>]> 2022-11-23T02:25:34.8884798Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:25:34.8885231Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:25:34.8885693Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:25:34.8886170Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:25:34.8886614Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) 2022-11-23T02:25:34.8887383Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8887847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8888417Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8888901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8889138Z 2022-11-23T02:25:34.8889248Z Running tests... 2022-11-23T02:25:34.8889664Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8890185Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.8890723Z test_collective_hang (__main__.ProcessGroupGlooWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.8891247Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55764 2022-11-23T02:25:34.8891789Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55765 2022-11-23T02:25:34.8892246Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55766 2022-11-23T02:25:34.8892696Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55767 2022-11-23T02:25:34.8893328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8893772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8894353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8894834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8895480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8895917Z 
warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8896500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8896977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8897564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8898004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8898586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8899059Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8899637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8900086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8900674Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8901151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8902048Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.8902542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.8903020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.8903500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.8903970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.8904497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for 
rank: 2 2022-11-23T02:25:34.8904995Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.8905497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:25:34.8906159Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.8906873Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.8907574Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.8908264Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.8908777Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:25:34.8909958Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:25:34.8928748Z [E ProcessGroupGloo.cpp:137] Rank 2 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-11-23T02:25:34.8929517Z [E ProcessGroupGloo.cpp:137] Rank 3 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank. 2022-11-23T02:25:34.8929961Z ok (4.277s) 2022-11-23T02:25:34.8930116Z 2022-11-23T02:25:34.8930435Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8930766Z Ran 1 test in 4.277s 2022-11-23T02:25:34.8930934Z 2022-11-23T02:25:34.8931010Z OK 2022-11-23T02:25:34.8931145Z 2022-11-23T02:25:34.8931275Z Generating XML reports... 
2022-11-23T02:25:34.8932015Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022358.xml 2022-11-23T02:25:34.8932764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8933223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8933814Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8934295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8934532Z 2022-11-23T02:25:34.8934640Z Running tests... 2022-11-23T02:25:34.8935029Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8935565Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.8936354Z test_collective_shape_mismatch (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.8936871Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 55971 2022-11-23T02:25:34.8937334Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 55972 2022-11-23T02:25:34.8937797Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 55973 2022-11-23T02:25:34.8938251Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 55974 2022-11-23T02:25:34.8938860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8939323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8939909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8940370Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8940963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8941413Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8941996Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8942454Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8943041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8943495Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8944053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8944524Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8945108Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8945561Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8946116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8946654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8947118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.8947602Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.8948060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.8948526Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.8949652Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.8950305Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.8950817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:25:34.8951328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:25:34.8952012Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.8952703Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.8953394Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:25:34.8954086Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.8954485Z ok (4.233s) 2022-11-23T02:25:34.8954617Z 2022-11-23T02:25:34.8954889Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8955217Z Ran 1 test in 4.233s 2022-11-23T02:25:34.8955381Z 2022-11-23T02:25:34.8955474Z OK 2022-11-23T02:25:34.8955611Z 2022-11-23T02:25:34.8955721Z Generating XML reports... 2022-11-23T02:25:34.8956359Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022404.xml 2022-11-23T02:25:34.8957105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8957563Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8958133Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8958614Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8958855Z 2022-11-23T02:25:34.8958964Z Running tests... 2022-11-23T02:25:34.8959356Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8959892Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.8960446Z test_collective_shape_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.8960970Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56178 2022-11-23T02:25:34.8961412Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56179 2022-11-23T02:25:34.8961858Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56180 2022-11-23T02:25:34.8962313Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56181 2022-11-23T02:25:34.8962912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8963378Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8963961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8964437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8965082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8965553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8966134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8966613Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8967177Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8967628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8968269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8968723Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8969315Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8969762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8970337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8970789Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8971232Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.8971712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.8972172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.8972648Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.8973044Z skip: Need at least 4 CUDA devices (3.931s) 2022-11-23T02:25:34.8973240Z 2022-11-23T02:25:34.8973515Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8973827Z Ran 1 test in 3.931s 2022-11-23T02:25:34.8973988Z 2022-11-23T02:25:34.8974098Z OK (skipped=1) 2022-11-23T02:25:34.8974257Z 2022-11-23T02:25:34.8974381Z Generating XML reports... 
2022-11-23T02:25:34.8974995Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022411.xml 2022-11-23T02:25:34.8975741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8976200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8976782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8977240Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8977522Z 2022-11-23T02:25:34.8977634Z Running tests... 2022-11-23T02:25:34.8978046Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8978587Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.8979130Z test_collective_shape_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.8979667Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56349 2022-11-23T02:25:34.8980127Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56350 2022-11-23T02:25:34.8980566Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56351 2022-11-23T02:25:34.8981022Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56352 2022-11-23T02:25:34.8981642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8982155Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8982734Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8983212Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8983795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8984227Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8984807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8985360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8985948Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8986379Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8986958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8987434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8988017Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8988446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8989810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8990309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8990745Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.8991229Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.8991712Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.8992188Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.8992567Z skip: Need at least 4 CUDA devices (4.016s) 2022-11-23T02:25:34.8992764Z 2022-11-23T02:25:34.8993049Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8993381Z Ran 1 test in 4.016s 2022-11-23T02:25:34.8993544Z 2022-11-23T02:25:34.8993634Z OK (skipped=1) 2022-11-23T02:25:34.8993789Z 2022-11-23T02:25:34.8993915Z Generating XML reports... 
2022-11-23T02:25:34.8994552Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022417.xml 2022-11-23T02:25:34.8995304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.8995746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.8996330Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.8996809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.8997044Z 2022-11-23T02:25:34.8997135Z Running tests... 2022-11-23T02:25:34.8997542Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.8998078Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.8998637Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.8999154Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56520 2022-11-23T02:25:34.8999619Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56521 2022-11-23T02:25:34.9000077Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56522 2022-11-23T02:25:34.9000606Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56523 2022-11-23T02:25:34.9001241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9001699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9002281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9002744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9003329Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9003857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9004439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9004903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9005488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9005936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9006498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9006970Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9007551Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9007996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9008563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9009031Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9009478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9009945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9010426Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.9010898Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.9011392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9011884Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9012392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:25:34.9012893Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:25:34.9013565Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9014250Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9014952Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:25:34.9015648Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9016187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:34.9016676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:34.9017176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:25:34.9017730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:25:34.9018386Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9019086Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9019778Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9020470Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9020845Z ok (4.268s) 2022-11-23T02:25:34.9021074Z 2022-11-23T02:25:34.9021345Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9021678Z Ran 1 test in 4.268s 2022-11-23T02:25:34.9021841Z 2022-11-23T02:25:34.9021935Z OK 2022-11-23T02:25:34.9022051Z 2022-11-23T02:25:34.9022180Z Generating XML reports... 
2022-11-23T02:25:34.9022813Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022423.xml 2022-11-23T02:25:34.9023551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9023991Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9024570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9025047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9025278Z 2022-11-23T02:25:34.9025391Z Running tests... 2022-11-23T02:25:34.9025781Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9026315Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9026856Z test_collectives_op_mismatch (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9027357Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56739 2022-11-23T02:25:34.9027816Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56740 2022-11-23T02:25:34.9028272Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56741 2022-11-23T02:25:34.9028727Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56742 2022-11-23T02:25:34.9030088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9030551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9031142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9031600Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9032191Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9032642Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9033219Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9033674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9034259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9034710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9035270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9035744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9036420Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9036886Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9037446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9037921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9038370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.9038852Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.9039314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9039856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9040349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9040843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9041348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:25:34.9041852Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:25:34.9042518Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9043201Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9043901Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:25:34.9044590Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9044986Z ok (4.273s) 2022-11-23T02:25:34.9045119Z 2022-11-23T02:25:34.9045393Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9045721Z Ran 1 test in 4.273s 2022-11-23T02:25:34.9045886Z 2022-11-23T02:25:34.9045980Z OK 2022-11-23T02:25:34.9046116Z 2022-11-23T02:25:34.9046221Z Generating XML reports... 2022-11-23T02:25:34.9046852Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022430.xml 2022-11-23T02:25:34.9047594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9048047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9048615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9049093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9049327Z 2022-11-23T02:25:34.9049438Z Running tests... 2022-11-23T02:25:34.9049826Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9050360Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9050899Z test_collectives_op_mismatch_cuda (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9051424Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 56946 2022-11-23T02:25:34.9051866Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 56947 2022-11-23T02:25:34.9052316Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 56948 2022-11-23T02:25:34.9052779Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 56949 2022-11-23T02:25:34.9053381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9053898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9054494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9054971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9055538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9055987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9056564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9057094Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9057658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9058108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9058691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9059144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9059706Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9060147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9060706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9061145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9061570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9062042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9062506Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.9062986Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.9063375Z skip: Need at least 4 CUDA devices (4.064s) 2022-11-23T02:25:34.9063571Z 2022-11-23T02:25:34.9063846Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9064158Z Ran 1 test in 4.064s 2022-11-23T02:25:34.9064318Z 2022-11-23T02:25:34.9064426Z OK (skipped=1) 2022-11-23T02:25:34.9064582Z 2022-11-23T02:25:34.9064706Z Generating XML reports... 
2022-11-23T02:25:34.9065318Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022436.xml 2022-11-23T02:25:34.9066062Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9066518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9067101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9067562Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9067797Z 2022-11-23T02:25:34.9067904Z Running tests... 2022-11-23T02:25:34.9068308Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9068841Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9069862Z test_collectives_op_mismatch_cuda_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9070410Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57117 2022-11-23T02:25:34.9070868Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57118 2022-11-23T02:25:34.9071308Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57119 2022-11-23T02:25:34.9071830Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57120 2022-11-23T02:25:34.9072466Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9072927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9073490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9073965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9074555Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9075058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9075629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9076087Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9076664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9077125Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9077761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9078236Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9078804Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9079259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9079839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9080313Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9080740Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9081219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.9081690Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9082160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.9082543Z skip: Need at least 4 CUDA devices (3.956s) 2022-11-23T02:25:34.9082738Z 2022-11-23T02:25:34.9083010Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9083344Z Ran 1 test in 3.956s 2022-11-23T02:25:34.9083508Z 2022-11-23T02:25:34.9083598Z OK (skipped=1) 2022-11-23T02:25:34.9083751Z 2022-11-23T02:25:34.9083875Z Generating XML reports... 
2022-11-23T02:25:34.9084511Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022443.xml 2022-11-23T02:25:34.9085253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9085693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9086273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9086747Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9086979Z 2022-11-23T02:25:34.9087070Z Running tests... 2022-11-23T02:25:34.9087479Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9088016Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9088572Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupGlooWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9089143Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57288 2022-11-23T02:25:34.9089613Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57289 2022-11-23T02:25:34.9090063Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 57290 2022-11-23T02:25:34.9090492Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 57291 2022-11-23T02:25:34.9091107Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9091564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9092146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9092660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9093241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9093697Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9094284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9094739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9095321Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9095770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9096327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9096802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9097383Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9097831Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9098392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9098857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9099298Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:25:34.9099760Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:25:34.9100235Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9100703Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9101198Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9101683Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:25:34.9102182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:25:34.9102676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9103342Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9104027Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9104719Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 
2022-11-23T02:25:34.9105423Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:25:34.9105959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:34.9106499Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:34.9107004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 3 2022-11-23T02:25:34.9107500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 2 2022-11-23T02:25:34.9108140Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9108835Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9110046Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9110837Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:2 with 4 nodes. 2022-11-23T02:25:34.9111214Z ok (4.373s) 2022-11-23T02:25:34.9111367Z 2022-11-23T02:25:34.9111636Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9111967Z Ran 1 test in 4.373s 2022-11-23T02:25:34.9112128Z 2022-11-23T02:25:34.9112218Z OK 2022-11-23T02:25:34.9112335Z 2022-11-23T02:25:34.9112460Z Generating XML reports... 
2022-11-23T02:25:34.9113094Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022449.xml 2022-11-23T02:25:34.9113834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9114274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9114861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9115339Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9115571Z 2022-11-23T02:25:34.9115683Z Running tests... 2022-11-23T02:25:34.9116069Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9116603Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9117126Z test_collective_hang (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9117618Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57507 2022-11-23T02:25:34.9118074Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57508 2022-11-23T02:25:34.9118692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9119157Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9119719Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9120191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9120771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9121195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9121775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9122244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9122689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9123154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9123648Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9124151Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9124886Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:34.9125582Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:34.9126114Z [E ProcessGroupGloo.cpp:2802] [Rank 0]: Rank 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:25:34.9126597Z [E ProcessGroupGloo.cpp:137] [Rank 0]: Ranks 1 failed to pass monitoredBarrier in 2000 ms 2022-11-23T02:25:34.9126931Z ok (3.965s) 2022-11-23T02:25:34.9127082Z 2022-11-23T02:25:34.9127355Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9127736Z Ran 1 test in 3.965s 2022-11-23T02:25:34.9127901Z 2022-11-23T02:25:34.9127993Z OK 2022-11-23T02:25:34.9128109Z 2022-11-23T02:25:34.9128232Z Generating XML reports... 2022-11-23T02:25:34.9128872Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022456.xml 2022-11-23T02:25:34.9129614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9130056Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9130639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9131114Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9131347Z 2022-11-23T02:25:34.9131457Z Running tests... 2022-11-23T02:25:34.9131844Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9132386Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9132930Z test_collective_shape_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9133439Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57620 2022-11-23T02:25:34.9133899Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57621 2022-11-23T02:25:34.9134515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9134974Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9135538Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9136008Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9136596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9137055Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9137621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9138092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9138537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9139019Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9139509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9140003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9140670Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:34.9141360Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:34.9141755Z ok (5.644s) 2022-11-23T02:25:34.9141904Z 2022-11-23T02:25:34.9142232Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9142552Z Ran 1 test in 5.644s 2022-11-23T02:25:34.9142717Z 2022-11-23T02:25:34.9142810Z OK 2022-11-23T02:25:34.9142944Z 2022-11-23T02:25:34.9143069Z Generating XML reports... 2022-11-23T02:25:34.9143701Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022502.xml 2022-11-23T02:25:34.9144427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9144885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9145528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9145987Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9146220Z 2022-11-23T02:25:34.9146330Z Running tests... 2022-11-23T02:25:34.9146739Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9147279Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9147818Z test_collective_shape_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9148347Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57749 2022-11-23T02:25:34.9148806Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57750 2022-11-23T02:25:34.9149867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9150331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9150912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9151391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9151959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9152415Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9152991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9153463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9153892Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9154381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9154883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9155369Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9156040Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:34.9156741Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:34.9157285Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:34.9157771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:34.9158429Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:34.9159129Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:34.9159529Z ok (5.777s) 2022-11-23T02:25:34.9159660Z 2022-11-23T02:25:34.9160012Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9160357Z Ran 1 test in 5.777s 2022-11-23T02:25:34.9160522Z 2022-11-23T02:25:34.9160614Z OK 2022-11-23T02:25:34.9160748Z 2022-11-23T02:25:34.9160856Z Generating XML reports... 2022-11-23T02:25:34.9161494Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022510.xml 2022-11-23T02:25:34.9162232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9162686Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9163253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9163798Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9164031Z 2022-11-23T02:25:34.9164142Z Running tests... 
2022-11-23T02:25:34.9164538Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9165073Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9165610Z test_collectives_op_mismatch (__main__.ProcessGroupNCCLWrapperTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9166129Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 57888 2022-11-23T02:25:34.9166571Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 57889 2022-11-23T02:25:34.9167184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9167640Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9168209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9168693Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9169283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9169733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9170295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9170765Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9171208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9171689Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9172173Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9172679Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9173345Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:34.9174027Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:34.9174431Z ok (6.721s) 2022-11-23T02:25:34.9174580Z 2022-11-23T02:25:34.9174902Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9175241Z Ran 1 test in 6.721s 2022-11-23T02:25:34.9175383Z 2022-11-23T02:25:34.9175477Z OK 2022-11-23T02:25:34.9175612Z 2022-11-23T02:25:34.9175737Z Generating XML reports... 2022-11-23T02:25:34.9176369Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022518.xml 2022-11-23T02:25:34.9177101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9177591Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9178249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9178737Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9178970Z 2022-11-23T02:25:34.9179059Z Running tests... 2022-11-23T02:25:34.9179466Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9180004Z Test results will be stored in test-reports/python-unittest/distributed.test_pg_wrapper 2022-11-23T02:25:34.9180540Z test_collectives_op_mismatch_debug_mode (__main__.ProcessGroupNCCLWrapperTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:25:34.9181070Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58018 2022-11-23T02:25:34.9181587Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58019 2022-11-23T02:25:34.9182206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9182648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9183231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9183706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9184289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:25:34.9184726Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:25:34.9185304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:25:34.9185781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:25:34.9186210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:25:34.9186694Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:25:34.9187185Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:25:34.9187689Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:25:34.9188341Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:25:34.9189437Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:25:34.9189998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:25:34.9190507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:25:34.9191156Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:34.9191862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:25:34.9192264Z ok (6.815s) 2022-11-23T02:25:34.9192415Z 2022-11-23T02:25:34.9192663Z ---------------------------------------------------------------------- 2022-11-23T02:25:34.9192992Z Ran 1 test in 6.815s 2022-11-23T02:25:34.9193154Z 2022-11-23T02:25:34.9193247Z OK 2022-11-23T02:25:34.9193380Z 2022-11-23T02:25:34.9193503Z Generating XML reports... 2022-11-23T02:25:34.9194114Z Generated XML report: test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022527.xml 2022-11-23T02:25:34.9194506Z 2022-11-23T02:25:34.9194941Z ##[endgroup] 2022-11-23T02:25:34.9195518Z FINISHED PRINTING LOG FILE of distributed/test_pg_wrapper (/var/lib/jenkins/workspace/test/test-reports/distributed-test_pg_wrapper_v631wbzi) 2022-11-23T02:25:34.9195862Z 2022-11-23T02:25:34.9196208Z Running distributed/fsdp/test_fsdp_comm_hooks ... [2022-11-23 02:25:34.887809] 2022-11-23T02:25:34.9196925Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm_hooks.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:25:34.888179] 2022-11-23T02:27:35.5752197Z 2022-11-23T02:27:35.5753079Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_comm_hooks 2022-11-23T02:27:35.5754193Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hooks_nmm2qg5d) 2022-11-23T02:27:35.5756290Z 2022-11-23T02:27:35.5756941Z Running tests... 2022-11-23T02:27:35.5757482Z ---------------------------------------------------------------------- 2022-11-23T02:27:35.5758348Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks 2022-11-23T02:27:35.5759002Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:27:35.5759557Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58158 2022-11-23T02:27:35.5760019Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58159 2022-11-23T02:27:35.5760664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5761132Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5761704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5762187Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5762794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5763256Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5763831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5764315Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5764785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5765270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5765945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5766656Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5767188Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5767654Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5768030Z dist init r=1, world=2 2022-11-23T02:27:35.5768293Z dist init r=0, world=2 2022-11-23T02:27:35.5768532Z ok (6.040s) 2022-11-23T02:27:35.5769041Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58241 2022-11-23T02:27:35.5769657Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58242 2022-11-23T02:27:35.5770282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5770750Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5771322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5771808Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5772510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5772959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5773548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5774032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5774499Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5775042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5775852Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5776564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5777117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5777584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5777954Z dist init r=0, world=2 2022-11-23T02:27:35.5778213Z dist init r=1, world=2 2022-11-23T02:27:35.5778453Z ok (4.412s) 2022-11-23T02:27:35.5778962Z test_bf16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58324 2022-11-23T02:27:35.5779584Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58325 2022-11-23T02:27:35.5780220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5780663Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5781254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5781739Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5782337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5782770Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5783359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5783837Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5784328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5784826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5785608Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5786318Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5786846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5787308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5787664Z dist init r=1, world=2 2022-11-23T02:27:35.5787913Z dist init r=0, world=2 2022-11-23T02:27:35.5788136Z ok (4.412s) 2022-11-23T02:27:35.5788652Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58407 2022-11-23T02:27:35.5789667Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58408 2022-11-23T02:27:35.5790386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5790838Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5791425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5791902Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5792473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5792927Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5793505Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5794054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5794496Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5795003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5795673Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5796353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5796884Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5797366Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5797725Z dist init r=1, world=2 2022-11-23T02:27:35.5797963Z dist init r=0, world=2 2022-11-23T02:27:35.5798203Z ok (4.412s) 2022-11-23T02:27:35.5798718Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58490 2022-11-23T02:27:35.5799313Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58491 2022-11-23T02:27:35.5799934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5800389Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5800970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5801428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5802011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5802463Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5803042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5803506Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5803961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5804461Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5805111Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5805808Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5806624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5807115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5807460Z dist init r=1, world=2 2022-11-23T02:27:35.5807711Z dist init r=0, world=2 2022-11-23T02:27:35.5807949Z ok (4.512s) 2022-11-23T02:27:35.5808507Z test_bf16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58573 2022-11-23T02:27:35.5809140Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58574 2022-11-23T02:27:35.5809766Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5810230Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5810799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5811327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5811916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5812377Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5812937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5813407Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5813867Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5814347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5815014Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5815717Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5816248Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5816713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5817064Z dist init r=0, world=2 2022-11-23T02:27:35.5817317Z dist init r=1, world=2 2022-11-23T02:27:35.5817538Z ok (4.412s) 2022-11-23T02:27:35.5817958Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5818719Z Tests FSDP's default communication hook's behavior and correctness. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58656 2022-11-23T02:27:35.5819274Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58657 2022-11-23T02:27:35.5819872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5820333Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5820918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5821405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5821978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5822428Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5823005Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:27:35.5823461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5823918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5824428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5825098Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5825833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5826371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5826849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5827189Z dist init r=1, world=2 2022-11-23T02:27:35.5827444Z dist init r=0, world=2 2022-11-23T02:27:35.5827683Z ok (4.413s) 2022-11-23T02:27:35.5828102Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5828844Z Tests FSDP's default communication hook's behavior and correctness. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58739 2022-11-23T02:27:35.5829910Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58740 2022-11-23T02:27:35.5830546Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5830986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5831565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5832038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5832615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5833044Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5833623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5834096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5834557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5835043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5835709Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5836406Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5836920Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5837403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5837764Z dist init r=0, world=2 2022-11-23T02:27:35.5838015Z dist init r=1, world=2 2022-11-23T02:27:35.5838238Z ok (4.412s) 2022-11-23T02:27:35.5838662Z test_default_communication_hook_behavior_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5839423Z Tests FSDP's default communication hook's behavior and correctness. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58822 2022-11-23T02:27:35.5839960Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58823 2022-11-23T02:27:35.5840575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5841031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5841610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5842075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5842666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5843117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5843759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5844253Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5844717Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5845222Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5845877Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5846574Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5847172Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5847653Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5847999Z dist init r=0, world=2 2022-11-23T02:27:35.5848253Z dist init r=1, world=2 2022-11-23T02:27:35.5848496Z ok (4.512s) 2022-11-23T02:27:35.5848938Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5849698Z Tests FSDP's communication hook interface behavior. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58905 2022-11-23T02:27:35.5850228Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58906 2022-11-23T02:27:35.5850839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5851284Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5851867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5852345Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5852915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5853365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5853943Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5854418Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5854861Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5855363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5856039Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5856741Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5857255Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5857734Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5858331Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5858705Z return func(*args, **kwargs) 2022-11-23T02:27:35.5859242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5859636Z _check_comm_hook( 2022-11-23T02:27:35.5860150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5860612Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5861227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5861624Z traceback.print_stack() 2022-11-23T02:27:35.5862111Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5862495Z return func(*args, **kwargs) 2022-11-23T02:27:35.5863029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5863423Z _check_comm_hook( 2022-11-23T02:27:35.5863919Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5864467Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5865031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5865397Z traceback.print_stack() 2022-11-23T02:27:35.5865904Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:27:35.5866288Z return func(*args, **kwargs) 2022-11-23T02:27:35.5866819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5867195Z _check_comm_hook( 2022-11-23T02:27:35.5867709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5868086Z p_assert( 2022-11-23T02:27:35.5868544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5868927Z traceback.print_stack() 2022-11-23T02:27:35.5869797Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5870181Z return func(*args, **kwargs) 2022-11-23T02:27:35.5870718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5871094Z _check_comm_hook( 2022-11-23T02:27:35.5871604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5871983Z p_assert( 2022-11-23T02:27:35.5872440Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5872822Z traceback.print_stack() 2022-11-23T02:27:35.5873091Z dist init r=0, world=2 2022-11-23T02:27:35.5873357Z Communication hook should not be `None` 2022-11-23T02:27:35.5873688Z Communication hook state should not be `None` 2022-11-23T02:27:35.5873982Z dist init r=1, world=2 2022-11-23T02:27:35.5874246Z Communication hook should not be `None` 2022-11-23T02:27:35.5874574Z Communication hook state should not be `None` 2022-11-23T02:27:35.5874857Z ok (4.413s) 2022-11-23T02:27:35.5875318Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD 
(__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5876097Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 58988 2022-11-23T02:27:35.5876632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 58989 2022-11-23T02:27:35.5877249Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5877689Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5878272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5878751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5879339Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5879844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5880442Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5880915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5881377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5881866Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5882534Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5883305Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5883818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5884301Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5884894Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5885286Z return func(*args, **kwargs) 2022-11-23T02:27:35.5885802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5886195Z _check_comm_hook( 2022-11-23T02:27:35.5886706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5887166Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5887737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5888119Z traceback.print_stack() 2022-11-23T02:27:35.5888628Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5888998Z return func(*args, **kwargs) 2022-11-23T02:27:35.5889531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5889921Z _check_comm_hook( 2022-11-23T02:27:35.5890417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5890892Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5891456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5891849Z traceback.print_stack() 2022-11-23T02:27:35.5892332Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:27:35.5892715Z return func(*args, **kwargs) 2022-11-23T02:27:35.5893255Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5893630Z _check_comm_hook( 2022-11-23T02:27:35.5894143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5894522Z p_assert( 2022-11-23T02:27:35.5894993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5895363Z traceback.print_stack() 2022-11-23T02:27:35.5895868Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5896257Z return func(*args, **kwargs) 2022-11-23T02:27:35.5896774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5897163Z _check_comm_hook( 2022-11-23T02:27:35.5897729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5898099Z p_assert( 2022-11-23T02:27:35.5898571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5898953Z traceback.print_stack() 2022-11-23T02:27:35.5899222Z dist init r=1, world=2 2022-11-23T02:27:35.5899489Z Communication hook should not be `None` 2022-11-23T02:27:35.5899822Z Communication hook state should not be `None` 2022-11-23T02:27:35.5900119Z dist init r=0, world=2 2022-11-23T02:27:35.5900388Z Communication hook should not be `None` 2022-11-23T02:27:35.5900717Z Communication hook state should not be `None` 2022-11-23T02:27:35.5901055Z ok (4.412s) 2022-11-23T02:27:35.5901501Z test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP 
(__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5902270Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59071 2022-11-23T02:27:35.5902805Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59072 2022-11-23T02:27:35.5903423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5903860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5904439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5904918Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5905488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5905943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5906524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5906997Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5907441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5907947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5908612Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5909621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5910147Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5910629Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5911232Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5911602Z return func(*args, **kwargs) 2022-11-23T02:27:35.5912137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5912526Z _check_comm_hook( 2022-11-23T02:27:35.5913036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5913491Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5914053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5914440Z traceback.print_stack() 2022-11-23T02:27:35.5914922Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5915305Z return func(*args, **kwargs) 2022-11-23T02:27:35.5915909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5916309Z _check_comm_hook( 2022-11-23T02:27:35.5916802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5917279Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5917838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5918207Z traceback.print_stack() 2022-11-23T02:27:35.5918707Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:27:35.5919155Z return func(*args, **kwargs) 2022-11-23T02:27:35.5919690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5920061Z _check_comm_hook( 2022-11-23T02:27:35.5920576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5920957Z p_assert( 2022-11-23T02:27:35.5921411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5921795Z traceback.print_stack() 2022-11-23T02:27:35.5922292Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5922654Z return func(*args, **kwargs) 2022-11-23T02:27:35.5923190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5923583Z _check_comm_hook( 2022-11-23T02:27:35.5924094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5924454Z p_assert( 2022-11-23T02:27:35.5924931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5925310Z traceback.print_stack() 2022-11-23T02:27:35.5925561Z dist init r=0, world=2 2022-11-23T02:27:35.5925847Z Communication hook should not be `None` 2022-11-23T02:27:35.5926178Z Communication hook state should not be `None` 2022-11-23T02:27:35.5926454Z dist init r=1, world=2 2022-11-23T02:27:35.5926740Z Communication hook should not be `None` 2022-11-23T02:27:35.5927065Z Communication hook state should not be `None` 2022-11-23T02:27:35.5927329Z ok (4.412s) 2022-11-23T02:27:35.5927792Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD 
(__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5928558Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59154 2022-11-23T02:27:35.5929089Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59155 2022-11-23T02:27:35.5929691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5930147Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5930729Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5931206Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5931774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5932226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5932809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5933263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5933776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5934292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5934962Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5935642Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5936174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5936656Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5937311Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5937683Z return func(*args, **kwargs) 2022-11-23T02:27:35.5938221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5938616Z _check_comm_hook( 2022-11-23T02:27:35.5939110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5939590Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5940152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5940535Z traceback.print_stack() 2022-11-23T02:27:35.5941019Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5941405Z return func(*args, **kwargs) 2022-11-23T02:27:35.5941937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5942311Z _check_comm_hook( 2022-11-23T02:27:35.5942823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5943302Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5943865Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5944230Z traceback.print_stack() 2022-11-23T02:27:35.5944731Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:27:35.5945114Z return func(*args, **kwargs) 2022-11-23T02:27:35.5945630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5946023Z _check_comm_hook( 2022-11-23T02:27:35.5946538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5946898Z p_assert( 2022-11-23T02:27:35.5947370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5947749Z traceback.print_stack() 2022-11-23T02:27:35.5948248Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5948617Z return func(*args, **kwargs) 2022-11-23T02:27:35.5949512Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5949911Z _check_comm_hook( 2022-11-23T02:27:35.5950411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5950795Z p_assert( 2022-11-23T02:27:35.5951267Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5951649Z traceback.print_stack() 2022-11-23T02:27:35.5951900Z dist init r=1, world=2 2022-11-23T02:27:35.5952261Z Communication hook should not be `None` 2022-11-23T02:27:35.5952602Z Communication hook state should not be `None` 2022-11-23T02:27:35.5952876Z dist init r=0, world=2 2022-11-23T02:27:35.5953164Z Communication hook should not be `None` 2022-11-23T02:27:35.5953491Z Communication hook state should not be `None` 2022-11-23T02:27:35.5953759Z ok (4.513s) 2022-11-23T02:27:35.5954213Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD 
(__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5954970Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59237 2022-11-23T02:27:35.5955563Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59238 2022-11-23T02:27:35.5956182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5956641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5957231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5957694Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5958282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5958736Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5959318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5959775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5960244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5960753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5961412Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5962112Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5962644Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5963124Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5963705Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5964099Z return func(*args, **kwargs) 2022-11-23T02:27:35.5964642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5965018Z _check_comm_hook( 2022-11-23T02:27:35.5965531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5966008Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5966575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5966940Z traceback.print_stack() 2022-11-23T02:27:35.5967442Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5967830Z return func(*args, **kwargs) 2022-11-23T02:27:35.5968344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5968743Z _check_comm_hook( 2022-11-23T02:27:35.5969250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5969726Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5970320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5970717Z traceback.print_stack() 2022-11-23T02:27:35.5971223Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:27:35.5971588Z return func(*args, **kwargs) 2022-11-23T02:27:35.5972123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5972512Z _check_comm_hook( 2022-11-23T02:27:35.5973021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5973434Z p_assert( 2022-11-23T02:27:35.5973905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5974285Z traceback.print_stack() 2022-11-23T02:27:35.5974771Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5975153Z return func(*args, **kwargs) 2022-11-23T02:27:35.5975682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5976118Z _check_comm_hook( 2022-11-23T02:27:35.5976610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5976983Z p_assert( 2022-11-23T02:27:35.5977457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5977820Z traceback.print_stack() 2022-11-23T02:27:35.5978093Z dist init r=1, world=2 2022-11-23T02:27:35.5978374Z Communication hook should not be `None` 2022-11-23T02:27:35.5978688Z Communication hook state should not be `None` 2022-11-23T02:27:35.5978980Z dist init r=0, world=2 2022-11-23T02:27:35.5979268Z Communication hook should not be `None` 2022-11-23T02:27:35.5979579Z Communication hook state should not be `None` 2022-11-23T02:27:35.5979860Z ok (4.413s) 2022-11-23T02:27:35.5980320Z test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP 
(__main__.TestCommunicationHooks) 2022-11-23T02:27:35.5981079Z Tests FSDP's communication hook interface behavior. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59320 2022-11-23T02:27:35.5981593Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59321 2022-11-23T02:27:35.5982209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5982667Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5983231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5983717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5984305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.5984758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.5985320Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.5985792Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.5986251Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.5986759Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.5987410Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.5988229Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.5988774Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.5989547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.5990152Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5990540Z return func(*args, **kwargs) 2022-11-23T02:27:35.5991072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5991540Z _check_comm_hook( 2022-11-23T02:27:35.5992060Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5992539Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5993090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5993473Z traceback.print_stack() 2022-11-23T02:27:35.5993970Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.5994358Z return func(*args, **kwargs) 2022-11-23T02:27:35.5994870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5995260Z _check_comm_hook( 2022-11-23T02:27:35.5995770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 652, in _check_comm_hook 2022-11-23T02:27:35.5996233Z p_assert(comm_hook is not None, "Communication hook should not be `None`") 2022-11-23T02:27:35.5996794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.5997181Z traceback.print_stack() 2022-11-23T02:27:35.5997685Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:27:35.5998051Z return func(*args, **kwargs) 2022-11-23T02:27:35.5998582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.5998968Z _check_comm_hook( 2022-11-23T02:27:35.5999457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.5999833Z p_assert( 2022-11-23T02:27:35.6000303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.6000674Z traceback.print_stack() 2022-11-23T02:27:35.6001175Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:27:35.6001556Z return func(*args, **kwargs) 2022-11-23T02:27:35.6002094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 515, in _post_backward_hook 2022-11-23T02:27:35.6002466Z _check_comm_hook( 2022-11-23T02:27:35.6002978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 653, in _check_comm_hook 2022-11-23T02:27:35.6003352Z p_assert( 2022-11-23T02:27:35.6003803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:27:35.6004190Z traceback.print_stack() 2022-11-23T02:27:35.6004457Z dist init r=0, world=2 2022-11-23T02:27:35.6004723Z Communication hook should not be `None` 2022-11-23T02:27:35.6005053Z Communication hook state should not be `None` 2022-11-23T02:27:35.6005347Z dist init r=1, world=2 2022-11-23T02:27:35.6005629Z Communication hook should not be `None` 2022-11-23T02:27:35.6005940Z Communication hook state should not be `None` 2022-11-23T02:27:35.6006220Z ok (4.413s) 2022-11-23T02:27:35.6006808Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59403 2022-11-23T02:27:35.6007419Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59404 2022-11-23T02:27:35.6008043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6008498Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6009081Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6009542Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6010182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6010635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6011201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6011676Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6012138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6012646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6013298Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6013995Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.6014530Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6015014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6015360Z dist init r=1, world=2 2022-11-23T02:27:35.6015613Z dist init r=0, world=2 2022-11-23T02:27:35.6015860Z ok (4.411s) 2022-11-23T02:27:35.6016358Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59486 2022-11-23T02:27:35.6016972Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59487 2022-11-23T02:27:35.6017592Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6018051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6018623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6019098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6019689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6020123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6020706Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6021179Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6021641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6022131Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6022801Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6023504Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6024083Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6024551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6024899Z dist init r=1, world=2 2022-11-23T02:27:35.6025149Z dist init r=0, world=2 2022-11-23T02:27:35.6025374Z ok (4.411s) 2022-11-23T02:27:35.6025893Z test_fp16_hook_has_wrapping_False_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59569 2022-11-23T02:27:35.6026507Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59570 2022-11-23T02:27:35.6027162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6027620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6028210Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6028689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6029607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6030073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6030652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6031126Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:27:35.6031568Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6032077Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6032752Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6033431Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6033963Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6034446Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6034807Z dist init r=1, world=2 2022-11-23T02:27:35.6035045Z dist init r=0, world=2 2022-11-23T02:27:35.6035285Z ok (4.412s) 2022-11-23T02:27:35.6035805Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59652 2022-11-23T02:27:35.6036402Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59653 2022-11-23T02:27:35.6037025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6037486Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6038067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6038528Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6039115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6039564Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6040142Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6040606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6041065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6041646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6042310Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6043010Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.6043541Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6044025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6044369Z dist init r=0, world=2 2022-11-23T02:27:35.6044695Z dist init r=1, world=2 2022-11-23T02:27:35.6044935Z ok (4.412s) 2022-11-23T02:27:35.6045433Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59735 2022-11-23T02:27:35.6046047Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59736 2022-11-23T02:27:35.6046664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6047121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6047686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6048161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6048742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6049182Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6049759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6050232Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6050692Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6051180Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6051846Z 
INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6052542Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6053073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6053538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6053896Z dist init r=0, world=2 2022-11-23T02:27:35.6054149Z dist init r=1, world=2 2022-11-23T02:27:35.6054371Z ok (4.411s) 2022-11-23T02:27:35.6054899Z test_fp16_hook_has_wrapping_True_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59818 2022-11-23T02:27:35.6055519Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59819 2022-11-23T02:27:35.6056189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6056628Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6057206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6057687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6058256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6058762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6059355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6059829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:27:35.6060274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6060779Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6061445Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6062198Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6062710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6063193Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6063551Z dist init r=0, world=2 2022-11-23T02:27:35.6063785Z dist init r=1, world=2 2022-11-23T02:27:35.6064028Z ok (4.411s) 2022-11-23T02:27:35.6064429Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.6065135Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59901 2022-11-23T02:27:35.6065682Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59902 2022-11-23T02:27:35.6066297Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6066759Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6067322Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6067804Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6068388Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6068837Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6069692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6070168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6070632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6071125Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6071791Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6072495Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.6073036Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6073498Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6073849Z dist init r=1, world=2 2022-11-23T02:27:35.6074102Z dist init r=0, world=2 2022-11-23T02:27:35.6074327Z ok (4.011s) 2022-11-23T02:27:35.6074726Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.6075446Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 59980 2022-11-23T02:27:35.6076028Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 59981 2022-11-23T02:27:35.6076701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6077171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6077758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6078237Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6078808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6079260Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6079839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6080362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6080821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6081332Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6081997Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6082679Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6083209Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6083686Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6084030Z dist init r=1, world=2 2022-11-23T02:27:35.6084286Z dist init r=0, world=2 2022-11-23T02:27:35.6084528Z ok (4.011s) 2022-11-23T02:27:35.6084936Z test_registering_hook_non_root_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.6085648Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60059 2022-11-23T02:27:35.6086192Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60060 2022-11-23T02:27:35.6086807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6087246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6087826Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6088301Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6088890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6089326Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6089906Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6090375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6090836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6091322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6091986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6092687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6093201Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6093684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6094038Z dist init r=0, world=2 2022-11-23T02:27:35.6094343Z dist init r=1, world=2 2022-11-23T02:27:35.6094573Z ok (3.911s) 2022-11-23T02:27:35.6094986Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_FULL_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.6095717Z Tests FSDP's communication hook registering for submodules. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60138 2022-11-23T02:27:35.6096240Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60139 2022-11-23T02:27:35.6096852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6097357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6097935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6098392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6098982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6099434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6099994Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6100469Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6100925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6101429Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6102085Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6102786Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:27:35.6103316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6103797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6104140Z dist init r=1, world=2 2022-11-23T02:27:35.6104393Z dist init r=0, world=2 2022-11-23T02:27:35.6104634Z ok (3.913s) 2022-11-23T02:27:35.6105027Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_NO_SHARD (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.6105756Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60217 2022-11-23T02:27:35.6106300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60218 2022-11-23T02:27:35.6106914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6107358Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6107940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6108419Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6109337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6109808Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6110392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6110872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6111313Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6111891Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6112571Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6113252Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6113785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6114263Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6114622Z dist init r=0, world=2 2022-11-23T02:27:35.6114858Z dist init r=1, world=2 2022-11-23T02:27:35.6115102Z ok (4.012s) 2022-11-23T02:27:35.6115597Z test_registering_hook_submodules_sharding_strategy_ShardingStrategy_SHARD_GRAD_OP (__main__.TestCommunicationHooks) 2022-11-23T02:27:35.6116314Z Tests FSDP's communication hook registering for submodules. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60296 2022-11-23T02:27:35.6116860Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60297 2022-11-23T02:27:35.6117473Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6117931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6118493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6118969Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6119551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:27:35.6120002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:27:35.6120563Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:27:35.6121038Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:27:35.6121500Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:27:35.6121989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:27:35.6122655Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6123353Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:27:35.6123882Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:27:35.6124349Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:27:35.6124706Z dist init r=0, world=2 2022-11-23T02:27:35.6124959Z dist init r=1, world=2 2022-11-23T02:27:35.6125182Z ok (3.911s) 2022-11-23T02:27:35.6125336Z 2022-11-23T02:27:35.6125611Z ---------------------------------------------------------------------- 2022-11-23T02:27:35.6125947Z Ran 27 tests in 118.352s 2022-11-23T02:27:35.6126115Z 2022-11-23T02:27:35.6126213Z OK 2022-11-23T02:27:35.6126329Z 2022-11-23T02:27:35.6126454Z Generating XML reports... 
2022-11-23T02:27:35.6127084Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20221123022536.xml 2022-11-23T02:27:35.6127463Z 2022-11-23T02:27:35.6127920Z ##[endgroup] 2022-11-23T02:27:35.6128521Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm_hooks (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_comm_hooks_nmm2qg5d) 2022-11-23T02:27:35.6128899Z 2022-11-23T02:27:35.6129206Z Running distributed/optim/test_zero_redundancy_optimizer ... [2022-11-23 02:27:35.575972] 2022-11-23T02:27:35.6130002Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/optim/test_zero_redundancy_optimizer.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:27:35.576241] 2022-11-23T02:30:35.7930380Z 2022-11-23T02:30:35.7931161Z Expand the folded group to see the log file of distributed/optim/test_zero_redundancy_optimizer 2022-11-23T02:30:35.7932268Z ##[group]PRINTING LOG FILE of distributed/optim/test_zero_redundancy_optimizer (/var/lib/jenkins/workspace/test/test-reports/distributed-optim-test_zero_redundancy_optimizer_pk2frmsx) 2022-11-23T02:30:35.7934525Z 2022-11-23T02:30:35.7934757Z Running tests... 2022-11-23T02:30:35.7937948Z ---------------------------------------------------------------------- 2022-11-23T02:30:35.7939102Z Test results will be stored in test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer 2022-11-23T02:30:35.7940193Z test_add_param_group (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.7940745Z Check that ZeroRedundancyOptimizer properly handles adding a new ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:30:35.7941865Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/67287 for allplatform(s) . 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.610s) 2022-11-23T02:30:35.7942672Z test_collect_shards (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.7943253Z Check the state consolidation mechanism and the state dict exposed ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60410 2022-11-23T02:30:35.7944105Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60411 2022-11-23T02:30:35.7944765Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7945232Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7945812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7948127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7949344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7950030Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7950655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7951128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7951583Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.7952081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.7952577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.7953106Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.7953793Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.7954504Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.7954911Z ok (5.717s) 2022-11-23T02:30:35.7955415Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.7956154Z Check that overlapping DDP with ZeRO using the given method determined ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60494 2022-11-23T02:30:35.7956723Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60495 2022-11-23T02:30:35.7957537Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7958008Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7958607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7959099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7959694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7960133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7960824Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7961307Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7961778Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 
2022-11-23T02:30:35.7962265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.7962767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.7963263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.7963923Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.7964635Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.7965569Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.7966647Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.7967334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7967814Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7968305Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7968790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7969124Z ok (4.813s) 2022-11-23T02:30:35.7969646Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.7970368Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60607 2022-11-23T02:30:35.7971271Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60608 2022-11-23T02:30:35.7971890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7972356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7972941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7973479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7974057Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7974516Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7975166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7975653Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7976080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.7976558Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.7977055Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.7977540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.7978222Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.7978999Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.7979924Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.7980993Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.7981662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7982154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7982640Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7983113Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7983467Z ok (4.713s) 2022-11-23T02:30:35.7983987Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.7984704Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60720 2022-11-23T02:30:35.7985241Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60721 2022-11-23T02:30:35.7985861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7986319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7986903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7987369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7987965Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.7988416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.7989448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.7989981Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.7990436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.7990947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.7991424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.7991920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.7992597Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.7993403Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.7994325Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.7995402Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.7996089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7996662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7997130Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7997625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.7997983Z ok (4.713s) 2022-11-23T02:30:35.7998481Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_False_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.7999195Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60833 2022-11-23T02:30:35.7999752Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60834 2022-11-23T02:30:35.8000379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8000828Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8001411Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8001890Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8002476Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8002908Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8003484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8003955Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8004381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8004866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8005359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8005866Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8006515Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8007219Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8008142Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8009217Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8009907Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8010506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8011008Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8011495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8011827Z ok (4.812s) 2022-11-23T02:30:35.8012344Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8013057Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 60946 2022-11-23T02:30:35.8013673Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 60947 2022-11-23T02:30:35.8014281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8014745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8015328Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8015812Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8016384Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8016832Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8017410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8017872Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8018315Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8018820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8019313Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8019785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8020452Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8021151Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8022078Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8023136Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8023829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8024328Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8024820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8025291Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8025640Z ok (4.813s) 2022-11-23T02:30:35.8026152Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8026853Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61059 2022-11-23T02:30:35.8027464Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61060 2022-11-23T02:30:35.8028099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8028566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8029647Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8030150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8030750Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8031305Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8031873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8032352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8032807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8033276Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8033771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8034273Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8034942Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8035627Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8036559Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8037628Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8038307Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8038785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8039281Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8039766Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8040126Z ok (4.836s) 2022-11-23T02:30:35.8040621Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8041343Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61172 2022-11-23T02:30:35.8041901Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61173 2022-11-23T02:30:35.8042520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8042960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8043550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8044026Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8044597Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8045048Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8045692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8046177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8046603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8047105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8047603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8048078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8048804Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8049509Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8050442Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8051502Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8052166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8052663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8053157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8053628Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8053981Z ok (4.813s) 2022-11-23T02:30:35.8054494Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_False_gradient_as_bucket_view_True_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8055210Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61285 2022-11-23T02:30:35.8055749Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61286 2022-11-23T02:30:35.8056372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8056835Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8057425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8057886Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8058471Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8058922Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8059486Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8059960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8060403Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8060902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8061387Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8061879Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8062620Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8063336Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8064246Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8065308Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8066051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8066546Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8067023Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8067509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8067863Z ok (4.813s) 2022-11-23T02:30:35.8068388Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8069580Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61398 2022-11-23T02:30:35.8070158Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61399 2022-11-23T02:30:35.8070799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8071243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8071832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8072309Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8072894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8073371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8073959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8074436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8074896Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8075362Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8075855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8076356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8077005Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8077704Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8078628Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8079689Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8080469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8080966Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8081458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8081949Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8082283Z ok (4.713s) 2022-11-23T02:30:35.8082794Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8083513Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61511 2022-11-23T02:30:35.8084149Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61512 2022-11-23T02:30:35.8084762Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8085226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8085806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8086283Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8086849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8087300Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8087873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8088336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8088776Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8089276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8089768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8090239Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8090902Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8091601Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8092529Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8093583Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8094263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8094756Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8095246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8095715Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8096064Z ok (4.812s) 2022-11-23T02:30:35.8096577Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8097357Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61624 2022-11-23T02:30:35.8097903Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61625 2022-11-23T02:30:35.8098528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8098990Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8099561Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8100044Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8100627Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8101139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8101703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8102183Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8102630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8103109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8103603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8104094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8104764Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8105454Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8106381Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8107443Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8108120Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8108599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8109656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8110174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8110531Z ok (4.712s) 2022-11-23T02:30:35.8111030Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_False_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8111745Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61737 2022-11-23T02:30:35.8112298Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61738 2022-11-23T02:30:35.8112929Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8113375Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8113961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8114449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8115019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8115562Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8116167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8116650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8117081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8117565Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8118072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8118649Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8119301Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8120010Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8120931Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8121997Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8122667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8123165Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8123656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8124146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8124482Z ok (4.713s) 2022-11-23T02:30:35.8124995Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_False_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8125712Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61850 2022-11-23T02:30:35.8126250Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61851 2022-11-23T02:30:35.8126873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8127339Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8127926Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8128391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8128981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8129432Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8130012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8130466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8130911Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8131392Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8131874Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8132379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8133100Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8133812Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8134717Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8135784Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8136529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8137029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8137499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8137985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8138342Z ok (4.713s) 2022-11-23T02:30:35.8138857Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_False_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8139551Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 61963 2022-11-23T02:30:35.8140114Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 61964 2022-11-23T02:30:35.8140741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8141209Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8141774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8142246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8142833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8143266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8143840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8144305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8144758Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8145234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8145726Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8146223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8146868Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8147575Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8148496Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8149986Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8150687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8151167Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8151657Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8152144Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8152474Z ok (4.713s) 2022-11-23T02:30:35.8152985Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_True_shard_buckets_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8153775Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62076 2022-11-23T02:30:35.8154327Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62077 2022-11-23T02:30:35.8154938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8155402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8155991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8156465Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8157036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8157493Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8158075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8158530Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8158977Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8159454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8159948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8160430Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8161095Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8161791Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8162715Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8163768Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8164449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8164939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8165431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8165898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8166252Z ok (4.713s) 2022-11-23T02:30:35.8166762Z test_ddp_zero_overlap_use_gpu_True_use_interleaved_hook_True_gradient_as_bucket_view_True_static_graph_True_shard_buckets_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8167529Z Check that overlapping DDP with ZeRO using the given method determined ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62189 2022-11-23T02:30:35.8168121Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62190 2022-11-23T02:30:35.8168781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8169267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8169867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8170367Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8171054Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8171543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8172145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8172650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8173123Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8173681Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8174193Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8174733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8175443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8176169Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8177147Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8178267Z INFO:torch.distributed.optim.zero_redundancy_optimizer:Using the functional optimizer instead of since `overlap_with_ddp=True` 2022-11-23T02:30:35.8178986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8179506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8180014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8180536Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8180908Z ok (4.813s) 2022-11-23T02:30:35.8181348Z test_local_optimizer_parity_optimizer_class_str_AdamW_maximize_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8182021Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62302 2022-11-23T02:30:35.8182596Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62303 2022-11-23T02:30:35.8183253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8183725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8184343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8184855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8185456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8186009Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8186638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8187148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8187599Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8188117Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8188643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8189607Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8190305Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8191055Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8191645Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfak4vuev 2022-11-23T02:30:35.8192201Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfak4vuev/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8192767Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7dhb55ha 2022-11-23T02:30:35.8193335Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7dhb55ha/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8194495Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:30:35.8196272Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:30:35.8197521Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8198340Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8199157Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8199961Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8200779Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8201594Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8202493Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8203325Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8204117Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8204930Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters 
changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8205820Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8206638Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8207434Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8208243Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8209050Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8209865Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8210377Z ok (5.116s) 2022-11-23T02:30:35.8210815Z test_local_optimizer_parity_optimizer_class_str_AdamW_maximize_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8211484Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62385 2022-11-23T02:30:35.8212064Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62386 2022-11-23T02:30:35.8212712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8213203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8213815Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8214329Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8214933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8215419Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8216032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8216534Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8216989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8217500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8218025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8218540Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8219246Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8220045Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8220649Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp69kyzzlt 2022-11-23T02:30:35.8221203Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp69kyzzlt/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8221776Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3mmtqidb 2022-11-23T02:30:35.8222350Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3mmtqidb/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8223518Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:30:35.8225289Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:30:35.8226535Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8227343Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8228159Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8229340Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8230173Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8230989Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8231799Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8232617Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8233426Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8234238Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters 
changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8235035Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8235930Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8236755Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8237566Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8238385Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8239179Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8239778Z ok (5.116s) 2022-11-23T02:30:35.8240244Z test_local_optimizer_parity_optimizer_class_str_Adam_maximize_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8240912Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62468 2022-11-23T02:30:35.8241473Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62469 2022-11-23T02:30:35.8242144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8242634Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8243233Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8243741Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8244362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8244852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8245446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8245954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8246427Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8246940Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8247505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8248043Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8248757Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8249489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8250074Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy0cpqkau 2022-11-23T02:30:35.8250649Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy0cpqkau/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8251218Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4lsnkiew 2022-11-23T02:30:35.8251775Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4lsnkiew/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8252989Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:30:35.8254698Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:30:35.8255944Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8256824Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8257639Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8258457Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8259253Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8260064Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8260886Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8261693Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8262487Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8263293Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters 
changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8264102Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8264914Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8265729Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8266518Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8267325Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8268135Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8268652Z ok (5.116s) 2022-11-23T02:30:35.8269258Z test_local_optimizer_parity_optimizer_class_str_Adam_maximize_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8270017Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62551 2022-11-23T02:30:35.8270610Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62552 2022-11-23T02:30:35.8271276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8271751Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8272369Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8272876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8273602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8274097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8274722Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8275234Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8275688Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8276195Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8276715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8277246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8277944Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8278692Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8279286Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt7zi5joq 2022-11-23T02:30:35.8279841Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt7zi5joq/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8280408Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjptto6f7 2022-11-23T02:30:35.8280983Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjptto6f7/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8282142Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:30:35.8283843Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:30:35.8285095Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8285914Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8286770Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8287599Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8288417Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8289238Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8290091Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8290915Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8291726Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8292539Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters 
changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8293346Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8294140Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8294950Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8295757Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8296564Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8297373Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8297872Z ok (5.116s) 2022-11-23T02:30:35.8298327Z test_local_optimizer_parity_optimizer_class_str_SGD_maximize_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8298993Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62634 2022-11-23T02:30:35.8299558Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62635 2022-11-23T02:30:35.8300218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8300710Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8301324Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8301817Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8302444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8302931Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8303601Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8304120Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8304597Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8305125Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8305624Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8306145Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8306914Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8307666Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8308243Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvfnx6sob 2022-11-23T02:30:35.8308825Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvfnx6sob/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8309743Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppa7w6pax 2022-11-23T02:30:35.8310299Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppa7w6pax/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8311463Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:30:35.8313157Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:30:35.8314399Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8315219Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8316038Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8316858Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8317654Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8318461Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8319279Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8320166Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8320995Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8321791Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters 
changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8322607Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8323488Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8324312Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8325128Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8325920Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8326726Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8327238Z ok (5.116s) 2022-11-23T02:30:35.8327692Z test_local_optimizer_parity_optimizer_class_str_SGD_maximize_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8328341Z When combined with DDP, check that a local optimizer gives the same ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62717 2022-11-23T02:30:35.8328920Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62718 2022-11-23T02:30:35.8329586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8330059Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8330681Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8331184Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8331809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8332276Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8332894Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8333400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8333856Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8334370Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8334887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8335418Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8336107Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8336859Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8337544Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl8jqmqg3 2022-11-23T02:30:35.8338128Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl8jqmqg3/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8338677Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp54hgccuy 2022-11-23T02:30:35.8339244Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp54hgccuy/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8340403Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:30:35.8342172Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:30:35.8343428Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8344256Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8345058Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8345873Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8346679Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8347491Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8348300Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8349261Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8350087Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8350902Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters 
changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8351711Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8352532Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8353398Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8354232Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8355045Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8355862Z WARNING:torch.distributed.optim.zero_redundancy_optimizer:ZeroRedundancyOptimizer detected that the trainable parameters changed; rebuilding the parameter buckets if enabled 2022-11-23T02:30:35.8356423Z ok (5.038s) 2022-11-23T02:30:35.8356797Z test_lr_scheduler (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8357377Z Check that a normal PyTorch ``lr_scheduler`` is usable with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62800 2022-11-23T02:30:35.8357948Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62801 2022-11-23T02:30:35.8358600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8359093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8359716Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8360205Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8360832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8361323Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8361935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8362428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8362903Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8363413Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8363921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8364453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8365159Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8365913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8366315Z ok (5.314s) 2022-11-23T02:30:35.8366703Z test_multiple_param_groups (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8367333Z Check parity between constructing ZeRO with multiple parameter groups ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62884 2022-11-23T02:30:35.8367926Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62885 2022-11-23T02:30:35.8368564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8369047Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8369662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8370154Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8370785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8371272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8371947Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8372445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8372916Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8373473Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8373982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8374515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8375287Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed 
store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8376037Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8376450Z ok (5.917s) 2022-11-23T02:30:35.8376850Z test_nondefault_process_group (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8377626Z Check that ZeroRedundancyOptimizer works with a non-default process ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 62968 2022-11-23T02:30:35.8378225Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 62969 2022-11-23T02:30:35.8378861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8379342Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8379964Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8380453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8381084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8381568Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8382183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8382674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8383143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8383652Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8384217Z INFO:torch.testing._internal.common_distributed:Skipping `test_nondefault_process_group()` since 
world size of 2 is less than 4 2022-11-23T02:30:35.8384812Z INFO:torch.testing._internal.common_distributed:Skipping `test_nondefault_process_group()` since world size of 2 is less than 4 2022-11-23T02:30:35.8385222Z ok (2.610s) 2022-11-23T02:30:35.8385586Z test_sharding (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8386786Z Check ZeroRedundancyOptimizer's parameter sharding at construction ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/67295 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:30:35.8387676Z test_step (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8388275Z Check that ZeroRedundancyOptimizer properly exposes the ``step()`` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63036 2022-11-23T02:30:35.8388873Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63037 2022-11-23T02:30:35.8389712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8390261Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8390901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8391414Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8392035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8392503Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8393115Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:30:35.8393704Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8394160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8394677Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8395201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8395734Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8396423Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8397168Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8397595Z ok (4.613s) 2022-11-23T02:30:35.8397955Z test_step_with_closure (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8398568Z Check that ZeroRedundancyOptimizer properly exposes the ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63119 2022-11-23T02:30:35.8399148Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63120 2022-11-23T02:30:35.8399806Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8400278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8400893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8401401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8402019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8402488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8403105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8403615Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8404075Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8404584Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8405108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8405640Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8406329Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8407074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8407505Z ok (4.614s) 2022-11-23T02:30:35.8407856Z test_zero_join_cpu (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8408501Z Check that the ZeRO join hook allows training with uneven inputs ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63202 2022-11-23T02:30:35.8409094Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63203 2022-11-23T02:30:35.8409753Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8410220Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8410836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8411341Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8411958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8412482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8413101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8413610Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8414064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8414576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8415102Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8415634Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8416322Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier 
for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8417071Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8417663Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxjnsq96a 2022-11-23T02:30:35.8418237Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxjnsq96a/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8418788Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwhnj4p_h 2022-11-23T02:30:35.8419363Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwhnj4p_h/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8419915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8420417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8421114Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:30:35.8421605Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:30:35.8422226Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:30:35.8422702Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:30:35.8423003Z ok (2.609s) 2022-11-23T02:30:35.8423371Z test_zero_join_gpu (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8423944Z Check that the ZeRO join hook allows training with uneven inputs ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63280 2022-11-23T02:30:35.8424519Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63281 2022-11-23T02:30:35.8425174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8425661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8426269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8426772Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8427459Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8427935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8428554Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8429224Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8429706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8430211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:30:35.8430820Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8431350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8432067Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:30:35.8432800Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:30:35.8433388Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiwdd69jt 2022-11-23T02:30:35.8433961Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiwdd69jt/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8434512Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjgatyc92 2022-11-23T02:30:35.8435081Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjgatyc92/_remote_module_non_scriptable.py 2022-11-23T02:30:35.8435634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8436155Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:30:35.8436838Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:30:35.8437334Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:30:35.8437958Z /opt/conda/lib/python3.10/tempfile.py:837: ResourceWarning: Implicitly cleaning up 2022-11-23T02:30:35.8438446Z _warnings.warn(warn_message, ResourceWarning) 2022-11-23T02:30:35.8438730Z ok (5.915s) 2022-11-23T02:30:35.8439166Z test_zero_model_parallel_parameters_as_bucket_view_False (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8439949Z Check that ZeRO works with model parallelism where the model's ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63364 2022-11-23T02:30:35.8440504Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63365 2022-11-23T02:30:35.8441160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8441648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8442265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8442755Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8443377Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8443863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8444460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8444968Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8445438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8446016Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8446436Z skip: Need at least 4 CUDA devices (2.509s) 2022-11-23T02:30:35.8446922Z test_zero_model_parallel_parameters_as_bucket_view_True (__main__.TestZeroRedundancyOptimizerDistributed) 2022-11-23T02:30:35.8447708Z Check that ZeRO works with model parallelism where the model's ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63432 2022-11-23T02:30:35.8448289Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63433 2022-11-23T02:30:35.8448920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8449466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8450086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8450578Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8451206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8451691Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8452307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8452793Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8453262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:30:35.8453775Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8454186Z skip: Need at least 4 CUDA devices (2.509s) 2022-11-23T02:30:35.8454597Z test_constructor (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8455207Z Check the robustness of the ZeroRedundancyOptimizer constructor by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63500 2022-11-23T02:30:35.8455970Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8456439Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8457052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8457557Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8458025Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8458527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8459233Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8459662Z ok (2.408s) 2022-11-23T02:30:35.8460017Z test_lr_scheduler (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8460594Z Check that a normal PyTorch ``lr_scheduler`` is usable with ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63535 2022-11-23T02:30:35.8461323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8461807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8462405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8462908Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8463382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8463889Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8464649Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8465084Z ok (3.910s) 2022-11-23T02:30:35.8465466Z test_same_dense_param_type (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8466068Z Check that ZeroRedundancyOptimizer raises an exception if the input ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63577 2022-11-23T02:30:35.8466833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8467320Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8467937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8468490Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8469114Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8469655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8470350Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8470777Z ok (2.408s) 2022-11-23T02:30:35.8471142Z test_state_dict (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8471750Z Check that ZeroRedundancyOptimizer exposes the expected state dict ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63612 2022-11-23T02:30:35.8472492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8472981Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8473635Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8474131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8474606Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8475136Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8475848Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8476254Z ok (3.912s) 2022-11-23T02:30:35.8476640Z test_step_with_extra_inner_key (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8477264Z Check that ZeroRedundancyOptimizer wrapping an optimizer that adds ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63654 2022-11-23T02:30:35.8478030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8478501Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8479121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8479628Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8480079Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8480602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8481301Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8481715Z ok (3.910s) 2022-11-23T02:30:35.8482100Z test_step_with_kwargs (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8482680Z Check that the ``step(**kwargs)`` interface is properly exposed. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63696 2022-11-23T02:30:35.8483504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8484003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8484623Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8485111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8485581Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8486104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8486890Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8487296Z ok (3.910s) 2022-11-23T02:30:35.8487673Z test_step_without_closure (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8488275Z Check that the ``step()`` method (without closure) is handled as ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63738 2022-11-23T02:30:35.8488991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8489475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8490086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8490595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8491047Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8491577Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8492283Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8492705Z ok (3.910s) 2022-11-23T02:30:35.8493045Z test_zero_grad (__main__.TestZeroRedundancyOptimizerSingleRank) 2022-11-23T02:30:35.8493613Z Check that the ``zero_grad`` method is properly handled. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63780 2022-11-23T02:30:35.8494338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:30:35.8494803Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:30:35.8495420Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:30:35.8495929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:30:35.8496401Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:30:35.8496909Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:30:35.8497613Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:30:35.8498034Z ok (2.507s) 2022-11-23T02:30:35.8498188Z 2022-11-23T02:30:35.8498456Z ---------------------------------------------------------------------- 2022-11-23T02:30:35.8498806Z Ran 42 tests in 177.659s 2022-11-23T02:30:35.8498979Z 2022-11-23T02:30:35.8499090Z OK (skipped=4) 2022-11-23T02:30:35.8499254Z 2022-11-23T02:30:35.8499383Z Generating XML reports... 
2022-11-23T02:30:35.8500122Z Generated XML report: test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerDistributed-20221123022737.xml 2022-11-23T02:30:35.8501169Z Generated XML report: test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerSingleRank-20221123022737.xml 2022-11-23T02:30:35.8501639Z 2022-11-23T02:30:35.8502074Z ##[endgroup] 2022-11-23T02:30:35.8502829Z FINISHED PRINTING LOG FILE of distributed/optim/test_zero_redundancy_optimizer (/var/lib/jenkins/workspace/test/test-reports/distributed-optim-test_zero_redundancy_optimizer_pk2frmsx) 2022-11-23T02:30:35.8503278Z 2022-11-23T02:30:35.8503578Z Running distributed/fsdp/test_fsdp_optim_state ... [2022-11-23 02:30:35.794269] 2022-11-23T02:30:35.8504326Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_optim_state.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:30:35.794560] 2022-11-23T02:34:47.1294044Z 2022-11-23T02:34:47.1294738Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_optim_state 2022-11-23T02:34:47.1297723Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_optim_state (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_optim_state_ch4des5k) 2022-11-23T02:34:47.1298405Z 2022-11-23T02:34:47.1298592Z Running tests... 2022-11-23T02:34:47.1301949Z ---------------------------------------------------------------------- 2022-11-23T02:34:47.1303053Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state 2022-11-23T02:34:47.1306321Z test_flatten_sharded_optim_state_dict_nested (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1307533Z Tests :meth:`flatten_sharded_optim_state_dict` for an FSDP-root ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:34:47.1308458Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63852 2022-11-23T02:34:47.1309192Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63853 2022-11-23T02:34:47.1309904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1310383Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1310979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1311444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1312036Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1312491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1313077Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1313537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1314349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1315138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1316479Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1317798Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1318750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1319622Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1321361Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1322621Z warnings.warn( 2022-11-23T02:34:47.1324302Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1325378Z warnings.warn( 2022-11-23T02:34:47.1326867Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1327742Z warnings.warn( 2022-11-23T02:34:47.1328801Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1329366Z warnings.warn( 2022-11-23T02:34:47.1329620Z dist init r=0, world=2 2022-11-23T02:34:47.1329967Z dist init r=1, world=2 2022-11-23T02:34:47.1330208Z ok (6.375s) 2022-11-23T02:34:47.1330560Z test_flatten_sharded_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1331248Z Tests :meth:`flatten_sharded_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 63935 2022-11-23T02:34:47.1332195Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 63936 2022-11-23T02:34:47.1333381Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1334218Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1335286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1336295Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1337424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1338212Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1339289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1340123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1340948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1341877Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1343095Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1344373Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1345317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1346199Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1347802Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1348775Z warnings.warn( 2022-11-23T02:34:47.1350596Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1351157Z warnings.warn( 2022-11-23T02:34:47.1351390Z dist init r=1, world=2 2022-11-23T02:34:47.1351639Z dist init r=0, world=2 2022-11-23T02:34:47.1351879Z ok (5.714s) 2022-11-23T02:34:47.1352186Z test_full_optim_state_dict_keys (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1352681Z Tests that the parameter keys returned by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64018 2022-11-23T02:34:47.1353308Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64019 2022-11-23T02:34:47.1353938Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1354400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1354982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1355462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1356030Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1356482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1357143Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1357622Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1358069Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1358569Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1359242Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1359924Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1360456Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1360936Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1361304Z dist init r=0, world=2 2022-11-23T02:34:47.1361536Z dist init r=1, world=2 2022-11-23T02:34:47.1361771Z ok (4.512s) 2022-11-23T02:34:47.1362106Z test_full_optim_state_dict_nested_invalid (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1362627Z Tests that :meth:`full_optim_state_dict` raises an error when ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64101 2022-11-23T02:34:47.1363162Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64102 2022-11-23T02:34:47.1363782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1364243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1364810Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1365292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1366365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1366806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1367394Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1367867Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1368324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1368812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:34:47.1369477Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1370231Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1370767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1371289Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1371654Z dist init r=0, world=2 2022-11-23T02:34:47.1371909Z dist init r=1, world=2 2022-11-23T02:34:47.1372130Z ok (4.512s) 2022-11-23T02:34:47.1372436Z test_optim_input_warning (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1372957Z Tests that passing the ``optim_input`` argument into optimizer state ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64184 2022-11-23T02:34:47.1373484Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64185 2022-11-23T02:34:47.1374108Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1374620Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1375202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1375663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1376247Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1376692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1377269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1377726Z warnings.warn(f"loaded 
{len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1378184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1378685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1379341Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1380042Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1380576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1381057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1381402Z dist init r=0, world=2 2022-11-23T02:34:47.1381653Z dist init r=1, world=2 2022-11-23T02:34:47.1381895Z ok (4.612s) 2022-11-23T02:34:47.1382368Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1383050Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64267 2022-11-23T02:34:47.1383594Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64268 2022-11-23T02:34:47.1384215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1384655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1385239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1385717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1386302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1386733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1387311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1387788Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1388234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1388794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1389982Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1390690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1391206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1391685Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1392598Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1393282Z warnings.warn( 2022-11-23T02:34:47.1394076Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1394639Z warnings.warn( 2022-11-23T02:34:47.1394889Z dist init r=1, world=2 2022-11-23T02:34:47.1395120Z dist init r=0, world=2 2022-11-23T02:34:47.1395357Z ok (4.512s) 2022-11-23T02:34:47.1395840Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1396516Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64350 2022-11-23T02:34:47.1397050Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64351 2022-11-23T02:34:47.1397675Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1398136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1398721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1399181Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1399770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1400217Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1400778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1401259Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1401725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1402230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1402879Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1403576Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1404110Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1404590Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1405494Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1406066Z warnings.warn( 2022-11-23T02:34:47.1406936Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1407506Z warnings.warn( 2022-11-23T02:34:47.1407739Z dist init r=0, world=2 2022-11-23T02:34:47.1407990Z dist init r=1, world=2 2022-11-23T02:34:47.1408227Z ok (4.612s) 2022-11-23T02:34:47.1408703Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1409429Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64433 2022-11-23T02:34:47.1409973Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64434 2022-11-23T02:34:47.1410602Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1411046Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1411630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1412106Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1412671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1413125Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1413705Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1414176Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1414621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1415126Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1415793Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1416494Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1417007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1417487Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1418414Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1418985Z warnings.warn( 2022-11-23T02:34:47.1419772Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1420340Z warnings.warn( 2022-11-23T02:34:47.1420589Z dist init r=1, world=2 2022-11-23T02:34:47.1420821Z dist init r=0, world=2 2022-11-23T02:34:47.1421059Z ok (4.612s) 2022-11-23T02:34:47.1421541Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1422213Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64516 2022-11-23T02:34:47.1422811Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64517 2022-11-23T02:34:47.1423445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1423903Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1424487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1424949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1425535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1425983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1426604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1427083Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1427546Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1428044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1428693Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1429933Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1430469Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1430953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1431870Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1432451Z warnings.warn( 2022-11-23T02:34:47.1433254Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1433809Z warnings.warn( 2022-11-23T02:34:47.1434038Z dist init r=1, world=2 2022-11-23T02:34:47.1434288Z dist init r=0, world=2 2022-11-23T02:34:47.1434523Z ok (4.612s) 2022-11-23T02:34:47.1434989Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1435665Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64599 2022-11-23T02:34:47.1436211Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64600 2022-11-23T02:34:47.1436831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1437270Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1437851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1438324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1438893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1439347Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1439927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1440398Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1440929Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1441450Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1442117Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1442813Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1443331Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1443880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1444805Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1445379Z warnings.warn( 2022-11-23T02:34:47.1446168Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1446728Z warnings.warn( 2022-11-23T02:34:47.1446977Z dist init r=1, world=2 2022-11-23T02:34:47.1447209Z dist init r=0, world=2 2022-11-23T02:34:47.1447445Z ok (4.512s) 2022-11-23T02:34:47.1447927Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1448608Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64682 2022-11-23T02:34:47.1449138Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64683 2022-11-23T02:34:47.1449759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1450213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1450835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1451392Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1452052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1452571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1453286Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1453858Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1454397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1454976Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1455664Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1456484Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1457086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1476766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1477961Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1478549Z warnings.warn( 2022-11-23T02:34:47.1479349Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1479898Z warnings.warn( 2022-11-23T02:34:47.1480145Z dist init r=0, world=2 2022-11-23T02:34:47.1480383Z dist init r=1, world=2 2022-11-23T02:34:47.1480617Z ok (4.511s) 2022-11-23T02:34:47.1481091Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1481837Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64765 2022-11-23T02:34:47.1482376Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64766 2022-11-23T02:34:47.1483006Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1483468Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1484037Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1484514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1485100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1485538Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1486118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1486595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1487041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1487534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1488183Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1488862Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1489376Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1489849Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1490775Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1491333Z warnings.warn( 2022-11-23T02:34:47.1492133Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1492677Z warnings.warn( 2022-11-23T02:34:47.1492914Z dist init r=1, world=2 2022-11-23T02:34:47.1493147Z dist init r=0, world=2 2022-11-23T02:34:47.1493387Z ok (4.612s) 2022-11-23T02:34:47.1493875Z test_optim_state_dict_nested_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1494597Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64848 2022-11-23T02:34:47.1495158Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64849 2022-11-23T02:34:47.1495780Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1496240Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1496809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1497281Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1497850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1498341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1498895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1499343Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1499781Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1500270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1500936Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1501639Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1502183Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1502649Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1503569Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1504143Z warnings.warn( 2022-11-23T02:34:47.1504945Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1505487Z warnings.warn( 2022-11-23T02:34:47.1505739Z dist init r=1, world=2 2022-11-23T02:34:47.1505991Z dist init r=0, world=2 2022-11-23T02:34:47.1506212Z ok (4.712s) 2022-11-23T02:34:47.1506707Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1507391Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 64931 2022-11-23T02:34:47.1507930Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 64932 2022-11-23T02:34:47.1508534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1509427Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1510065Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1510546Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1511125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1511571Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1512238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1512708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1513170Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1513674Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1514340Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1515021Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1515621Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1516103Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1516990Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1517529Z warnings.warn( 2022-11-23T02:34:47.1518296Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1518839Z warnings.warn( 2022-11-23T02:34:47.1519641Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1520189Z warnings.warn( 2022-11-23T02:34:47.1520992Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1521559Z warnings.warn( 2022-11-23T02:34:47.1521807Z dist init r=0, world=2 2022-11-23T02:34:47.1522039Z dist init r=1, world=2 2022-11-23T02:34:47.1522275Z ok (4.612s) 2022-11-23T02:34:47.1522764Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1523426Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65014 2022-11-23T02:34:47.1523977Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65015 2022-11-23T02:34:47.1524594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1525057Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1525612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1526062Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1526646Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1527122Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1527698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1528174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1528638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1529186Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1529864Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1530567Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1531105Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1531568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1532448Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1533078Z warnings.warn( 2022-11-23T02:34:47.1533854Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1534389Z warnings.warn( 2022-11-23T02:34:47.1535192Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1535759Z warnings.warn( 2022-11-23T02:34:47.1536555Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1537099Z warnings.warn( 2022-11-23T02:34:47.1537347Z dist init r=0, world=2 2022-11-23T02:34:47.1537605Z dist init r=1, world=2 2022-11-23T02:34:47.1537827Z ok (4.717s) 2022-11-23T02:34:47.1538320Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1539002Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65097 2022-11-23T02:34:47.1539556Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65098 2022-11-23T02:34:47.1540162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1540617Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1541204Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1541680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1542253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1542703Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1543284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1543732Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1544198Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1544710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1545386Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1546075Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1546670Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1547158Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1547526Z dist init r=0, world=2 2022-11-23T02:34:47.1547763Z dist init r=1, world=2 2022-11-23T02:34:47.1548007Z ok (3.912s) 2022-11-23T02:34:47.1548497Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1549567Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65176 2022-11-23T02:34:47.1550225Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65177 2022-11-23T02:34:47.1550866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1551329Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1551899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1552372Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1552960Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1553394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1553961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1554439Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1554901Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:34:47.1555395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1556069Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1556774Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1557316Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1557780Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1558140Z dist init r=1, world=2 2022-11-23T02:34:47.1558400Z dist init r=0, world=2 2022-11-23T02:34:47.1558625Z ok (3.912s) 2022-11-23T02:34:47.1559117Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1559807Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65255 2022-11-23T02:34:47.1560357Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65256 2022-11-23T02:34:47.1560963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1561426Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1562014Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1562474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1563070Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1563521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1564206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1564671Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1565135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1565643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1566296Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1566993Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1567580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1568064Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1568936Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1569499Z warnings.warn( 2022-11-23T02:34:47.1570316Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1570868Z warnings.warn( 2022-11-23T02:34:47.1571651Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1572225Z warnings.warn( 2022-11-23T02:34:47.1573028Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1573596Z warnings.warn( 2022-11-23T02:34:47.1573828Z dist init r=0, world=2 2022-11-23T02:34:47.1574081Z dist init r=1, world=2 2022-11-23T02:34:47.1574322Z ok (4.713s) 2022-11-23T02:34:47.1574794Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1575477Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65338 2022-11-23T02:34:47.1576025Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65339 2022-11-23T02:34:47.1576649Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1577093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1577680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1578161Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1578751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1579183Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1579764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1580244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1580711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1581256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1581933Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1582638Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1583152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1583635Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1584512Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1585121Z warnings.warn( 2022-11-23T02:34:47.1585877Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1586432Z warnings.warn( 2022-11-23T02:34:47.1587235Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1587804Z warnings.warn( 2022-11-23T02:34:47.1588579Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1589583Z warnings.warn( 2022-11-23T02:34:47.1589837Z dist init r=0, world=2 2022-11-23T02:34:47.1590098Z dist init r=1, world=2 2022-11-23T02:34:47.1590322Z ok (4.612s) 2022-11-23T02:34:47.1590813Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1591492Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65421 2022-11-23T02:34:47.1592023Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65422 2022-11-23T02:34:47.1592658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1593123Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1593707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1594173Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1594768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1595222Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1595803Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1596263Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1596727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1597240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1597896Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1598701Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1599247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1599730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1600077Z dist init r=1, world=2 2022-11-23T02:34:47.1600333Z dist init r=0, world=2 2022-11-23T02:34:47.1600577Z ok (3.911s) 2022-11-23T02:34:47.1601049Z test_optim_state_dict_nested_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1601727Z Tests :meth:`full_optim_state_dict` and meth:`sharded_optim_state_dict` ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65500 2022-11-23T02:34:47.1602342Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65501 2022-11-23T02:34:47.1602979Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1603422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1604011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1604489Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1605134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1605573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1606156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1606635Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1607085Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:34:47.1607595Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1608260Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1608966Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1609484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1609964Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1610332Z dist init r=0, world=2 2022-11-23T02:34:47.1610569Z dist init r=1, world=2 2022-11-23T02:34:47.1610812Z ok (4.011s) 2022-11-23T02:34:47.1611253Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1611877Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65579 2022-11-23T02:34:47.1612394Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65580 2022-11-23T02:34:47.1613013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1613473Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1614053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1614513Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1615106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1615559Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1616172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1616657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1617121Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1617635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1618287Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1618990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1619576Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1620057Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1620964Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1621544Z warnings.warn( 2022-11-23T02:34:47.1622346Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1622906Z warnings.warn( 2022-11-23T02:34:47.1623142Z dist init r=1, world=2 2022-11-23T02:34:47.1623396Z dist init r=0, world=2 2022-11-23T02:34:47.1623637Z ok (4.612s) 2022-11-23T02:34:47.1624058Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_FULL_STATE_DICT_use_multiple_param_groups_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1624680Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65662 2022-11-23T02:34:47.1625217Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65663 2022-11-23T02:34:47.1625838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1626281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1626864Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1627342Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1627914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1628366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1629406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1630019Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1630460Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1630970Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1631642Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1632328Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1632868Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1633354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1634352Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1634923Z warnings.warn( 2022-11-23T02:34:47.1635735Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1636302Z warnings.warn( 2022-11-23T02:34:47.1636556Z dist init r=0, world=2 2022-11-23T02:34:47.1636854Z dist init r=1, world=2 2022-11-23T02:34:47.1637097Z ok (4.712s) 2022-11-23T02:34:47.1637544Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1638152Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65745 2022-11-23T02:34:47.1638689Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65746 2022-11-23T02:34:47.1639315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1639776Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1640345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1640830Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1641427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1641880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1642444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1642921Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1643383Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1643876Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1644549Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1645255Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1645797Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1646265Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1647150Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1647710Z warnings.warn( 2022-11-23T02:34:47.1648478Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1649015Z warnings.warn( 2022-11-23T02:34:47.1649814Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1650384Z warnings.warn( 2022-11-23T02:34:47.1651233Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1651776Z warnings.warn( 2022-11-23T02:34:47.1652025Z dist init r=0, world=2 2022-11-23T02:34:47.1652278Z dist init r=1, world=2 2022-11-23T02:34:47.1652501Z ok (4.712s) 2022-11-23T02:34:47.1652939Z test_rekey_optim_state_dict_to_ids_state_dict_type_StateDictType_SHARDED_STATE_DICT_use_multiple_param_groups_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1653560Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65828 2022-11-23T02:34:47.1654143Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65829 2022-11-23T02:34:47.1654746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1655208Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1655795Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1656275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1656842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1657296Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1657873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1658331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1658791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1659310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1659977Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1660660Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1661194Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1661671Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1662559Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1663102Z warnings.warn( 2022-11-23T02:34:47.1663874Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1664429Z warnings.warn( 2022-11-23T02:34:47.1665232Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1665783Z warnings.warn( 2022-11-23T02:34:47.1666582Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1667149Z warnings.warn( 2022-11-23T02:34:47.1667399Z dist init r=1, world=2 2022-11-23T02:34:47.1667633Z dist init r=0, world=2 2022-11-23T02:34:47.1667875Z ok (4.712s) 2022-11-23T02:34:47.1668256Z test_rekey_optim_state_dict_to_names (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1668772Z Tests :meth:`rekey_optim_state_dict` with the new keys being ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65911 2022-11-23T02:34:47.1669485Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65912 2022-11-23T02:34:47.1670147Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1670586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1671171Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1671726Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1672315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1672752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1673332Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1673802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1674262Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1674750Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1675416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1676124Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1676639Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1677126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1678039Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1678610Z warnings.warn( 2022-11-23T02:34:47.1679399Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1679958Z warnings.warn( 2022-11-23T02:34:47.1680210Z dist init r=1, world=2 2022-11-23T02:34:47.1680466Z dist init r=0, world=2 2022-11-23T02:34:47.1680689Z ok (4.712s) 2022-11-23T02:34:47.1681093Z test_save_load_without_0th_param_state_state_dict_type_StateDictType_FULL_STATE_DICT (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1681694Z Tests saving and loading an optim state dict for Adam optimizer (i.e. ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 65994 2022-11-23T02:34:47.1682223Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 65995 2022-11-23T02:34:47.1682842Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1683297Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1683881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1684349Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1684939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1685457Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1686048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1686510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1686971Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1687485Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1688140Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1688905Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1689440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1689925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1690267Z dist init r=1, world=2 2022-11-23T02:34:47.1690527Z dist init r=0, world=2 2022-11-23T02:34:47.1690772Z ok (4.412s) 2022-11-23T02:34:47.1691159Z test_save_load_without_0th_param_state_state_dict_type_StateDictType_SHARDED_STATE_DICT (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1691768Z Tests saving and loading an optim state dict for Adam optimizer (i.e. ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66077 2022-11-23T02:34:47.1692313Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66078 2022-11-23T02:34:47.1692933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1693376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1693966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1694445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1695019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1695475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1696055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1696532Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1696975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1697489Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1698158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1698860Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1699375Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1699857Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1700741Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1701298Z warnings.warn( 2022-11-23T02:34:47.1702054Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1702607Z warnings.warn( 2022-11-23T02:34:47.1702909Z dist init r=1, world=2 2022-11-23T02:34:47.1703152Z dist init r=0, world=2 2022-11-23T02:34:47.1703393Z ok (4.512s) 2022-11-23T02:34:47.1703754Z test_scatter_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1704440Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66160 2022-11-23T02:34:47.1704986Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66161 2022-11-23T02:34:47.1705604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1706115Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1706686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1707166Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1707755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1708206Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1708768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1709730Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1710623Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1711547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1712271Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1712977Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1713513Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1713980Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1714465Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:34:47.1714975Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:34:47.1715643Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1716327Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1717299Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1717884Z warnings.warn( 2022-11-23T02:34:47.1718686Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1719223Z warnings.warn( 2022-11-23T02:34:47.1719608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:34:47.1720117Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:34:47.1720788Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:34:47.1721580Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:34:47.1722004Z dist init r=0, world=2 2022-11-23T02:34:47.1722260Z dist init r=1, world=2 2022-11-23T02:34:47.1722487Z ok (4.812s) 2022-11-23T02:34:47.1722933Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1723702Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66253 2022-11-23T02:34:47.1724252Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66254 2022-11-23T02:34:47.1724851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1725386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1725977Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1726436Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1727025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1727475Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1728053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1728510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1728974Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1729491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:34:47.1730158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1730845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1731380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1731860Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1732783Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1733346Z warnings.warn( 2022-11-23T02:34:47.1734146Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1734715Z warnings.warn( 2022-11-23T02:34:47.1734947Z dist init r=0, world=2 2022-11-23T02:34:47.1735208Z dist init r=1, world=2 2022-11-23T02:34:47.1735449Z ok (4.912s) 2022-11-23T02:34:47.1735892Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1736643Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66336 2022-11-23T02:34:47.1737191Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66337 2022-11-23T02:34:47.1737807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1738251Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1738887Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1739371Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1739962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1740394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1740974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1741447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1741911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1742456Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1743120Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1743827Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1744345Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1744825Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1745738Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1746320Z warnings.warn( 2022-11-23T02:34:47.1747109Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1747673Z warnings.warn( 2022-11-23T02:34:47.1747925Z dist init r=1, world=2 2022-11-23T02:34:47.1748178Z dist init r=0, world=2 2022-11-23T02:34:47.1748404Z ok (4.912s) 2022-11-23T02:34:47.1748846Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1749790Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66419 2022-11-23T02:34:47.1750321Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66420 2022-11-23T02:34:47.1750937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1751398Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1751989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1752452Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1753039Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1753489Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1754069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1754527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1754987Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1755502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1756155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1756932Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1757478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1757961Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1758862Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1759439Z warnings.warn( 2022-11-23T02:34:47.1760301Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1760870Z warnings.warn( 2022-11-23T02:34:47.1761103Z dist init r=1, world=2 2022-11-23T02:34:47.1761356Z dist init r=0, world=2 2022-11-23T02:34:47.1761596Z ok (4.812s) 2022-11-23T02:34:47.1762018Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1762791Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66502 2022-11-23T02:34:47.1763339Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66503 2022-11-23T02:34:47.1763958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1764402Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1764988Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1765473Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1766067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1766500Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1767085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1767559Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1768003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1768515Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1769184Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1769938Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1770456Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1770939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1771853Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1772429Z warnings.warn( 2022-11-23T02:34:47.1773225Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1773835Z warnings.warn( 2022-11-23T02:34:47.1774092Z dist init r=0, world=2 2022-11-23T02:34:47.1774325Z dist init r=1, world=2 2022-11-23T02:34:47.1774566Z ok (4.712s) 2022-11-23T02:34:47.1775007Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1775775Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66585 2022-11-23T02:34:47.1776304Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66586 2022-11-23T02:34:47.1776920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1777431Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1778016Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1778480Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1779071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1779523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1780085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1780561Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1781023Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1781539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1782191Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1782895Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1783423Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1783906Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1784805Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1785380Z warnings.warn( 2022-11-23T02:34:47.1786183Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1786746Z warnings.warn( 2022-11-23T02:34:47.1786976Z dist init r=1, world=2 2022-11-23T02:34:47.1787230Z dist init r=0, world=2 2022-11-23T02:34:47.1787473Z ok (4.912s) 2022-11-23T02:34:47.1787894Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1788663Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66668 2022-11-23T02:34:47.1789375Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66669 2022-11-23T02:34:47.1789999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1790445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1791028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1791581Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1792160Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1792610Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1793187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1793668Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1794109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1794688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1795358Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1796060Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1796580Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1797063Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1797984Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1798556Z warnings.warn( 2022-11-23T02:34:47.1799351Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1799923Z warnings.warn( 2022-11-23T02:34:47.1800178Z dist init r=0, world=2 2022-11-23T02:34:47.1800416Z dist init r=1, world=2 2022-11-23T02:34:47.1800662Z ok (4.912s) 2022-11-23T02:34:47.1801102Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1801849Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66751 2022-11-23T02:34:47.1802397Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66752 2022-11-23T02:34:47.1803018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1803481Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1804050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1804533Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1805120Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1805573Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1806134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1806612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1807077Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1807575Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1808246Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1809000Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1809542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1810005Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1810919Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1811486Z warnings.warn( 2022-11-23T02:34:47.1812350Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1812889Z warnings.warn( 2022-11-23T02:34:47.1813146Z dist init r=0, world=2 2022-11-23T02:34:47.1813399Z dist init r=1, world=2 2022-11-23T02:34:47.1813620Z ok (4.712s) 2022-11-23T02:34:47.1814058Z test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1814828Z Tests :meth:`scatter_full_optim_state_dict` for a non-FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66834 2022-11-23T02:34:47.1815385Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66835 2022-11-23T02:34:47.1815983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1816445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1817029Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1817514Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1818085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1818542Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1819126Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1819583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1820047Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1820566Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1821236Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1821925Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1822458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1822937Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1823855Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1824410Z warnings.warn( 2022-11-23T02:34:47.1825219Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1825790Z warnings.warn( 2022-11-23T02:34:47.1826091Z dist init r=0, world=2 2022-11-23T02:34:47.1826331Z dist init r=1, world=2 2022-11-23T02:34:47.1826573Z ok (4.712s) 2022-11-23T02:34:47.1826919Z test_scatter_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1827574Z Tests :meth:`scatter_full_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 66917 2022-11-23T02:34:47.1828116Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 66918 2022-11-23T02:34:47.1828736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1829476Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1830053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1830531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1831119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1831550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1832132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1832606Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1833065Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1833557Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1834229Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1834937Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1835477Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1835944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1836434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:34:47.1836941Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:34:47.1837593Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1838292Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1838838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:34:47.1839344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:34:47.1839982Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:34:47.1840679Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:34:47.1841090Z dist init r=1, world=2 2022-11-23T02:34:47.1841347Z dist init r=0, world=2 2022-11-23T02:34:47.1841574Z ok (5.313s) 2022-11-23T02:34:47.1841930Z test_shard_full_optim_state_dict_nested_halve_world_size (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1842634Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67010 2022-11-23T02:34:47.1843171Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67011 2022-11-23T02:34:47.1843879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1844348Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1844931Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1845390Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1845980Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1846436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1846995Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1847539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1848003Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1848521Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1849171Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1849868Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1850404Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1850885Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1851361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:34:47.1851877Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:34:47.1852543Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1853224Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1854194Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1854773Z warnings.warn( 2022-11-23T02:34:47.1855573Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1856136Z warnings.warn( 2022-11-23T02:34:47.1856503Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:34:47.1857017Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:34:47.1857682Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:34:47.1858364Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:34:47.1858782Z dist init r=0, world=2 2022-11-23T02:34:47.1859040Z dist init r=1, world=2 2022-11-23T02:34:47.1859266Z ok (4.812s) 2022-11-23T02:34:47.1859706Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1860483Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67103 2022-11-23T02:34:47.1861035Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67104 2022-11-23T02:34:47.1861684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1862153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1862741Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1863222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1863794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1864253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1864886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1865347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1865816Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1866329Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:34:47.1866999Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1867680Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1868210Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1868692Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1869892Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1870453Z warnings.warn( 2022-11-23T02:34:47.1871252Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1871822Z warnings.warn( 2022-11-23T02:34:47.1872073Z dist init r=0, world=2 2022-11-23T02:34:47.1872308Z dist init r=1, world=2 2022-11-23T02:34:47.1872549Z ok (4.813s) 2022-11-23T02:34:47.1872989Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1873747Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67186 2022-11-23T02:34:47.1874300Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67187 2022-11-23T02:34:47.1874921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1875380Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1875946Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1876426Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1877017Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1877452Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1878035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1878505Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1879049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1879556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1880225Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1880932Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1881467Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1881928Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1882908Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1883477Z warnings.warn( 2022-11-23T02:34:47.1884279Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1884818Z warnings.warn( 2022-11-23T02:34:47.1885071Z dist init r=1, world=2 2022-11-23T02:34:47.1885324Z dist init r=0, world=2 2022-11-23T02:34:47.1885545Z ok (4.812s) 2022-11-23T02:34:47.1885984Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1886756Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67269 2022-11-23T02:34:47.1887305Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67270 2022-11-23T02:34:47.1887909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1888369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1888955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1889416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1890000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1890449Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1891032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1891493Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1891958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1892469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1893138Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1893825Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1894361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1894841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1895816Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1896378Z warnings.warn( 2022-11-23T02:34:47.1897035Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1897147Z warnings.warn( 2022-11-23T02:34:47.1897260Z dist init r=1, world=2 2022-11-23T02:34:47.1897372Z dist init r=0, world=2 2022-11-23T02:34:47.1897474Z ok (4.712s) 2022-11-23T02:34:47.1897784Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1898269Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67352 2022-11-23T02:34:47.1898494Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67353 2022-11-23T02:34:47.1898871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1899050Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1899436Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1899629Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1899998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1900174Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1900562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1900738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1900992Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1901244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1901649Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1902051Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1902291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1902523Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1903191Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1903306Z warnings.warn( 2022-11-23T02:34:47.1903947Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1904058Z warnings.warn( 2022-11-23T02:34:47.1904171Z dist init r=1, world=2 2022-11-23T02:34:47.1904282Z dist init r=0, world=2 2022-11-23T02:34:47.1904382Z ok (4.812s) 2022-11-23T02:34:47.1904692Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1905202Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67435 2022-11-23T02:34:47.1905427Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67436 2022-11-23T02:34:47.1905835Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1906021Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1906408Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1906605Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1906975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1907149Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1907581Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1907774Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1908029Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1908259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1908666Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1909289Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1909530Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1909763Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1910432Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1910548Z warnings.warn( 2022-11-23T02:34:47.1911214Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1911324Z warnings.warn( 2022-11-23T02:34:47.1911417Z dist init r=0, world=2 2022-11-23T02:34:47.1911527Z dist init r=1, world=2 2022-11-23T02:34:47.1911628Z ok (4.912s) 2022-11-23T02:34:47.1911936Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1912395Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67518 2022-11-23T02:34:47.1912618Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67519 2022-11-23T02:34:47.1912998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1913176Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1913562Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1913738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1914111Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1914285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1914668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1914861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1915182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1915438Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1915846Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1916232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1916464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1916695Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1917423Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1917541Z warnings.warn( 2022-11-23T02:34:47.1918208Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1918319Z warnings.warn( 2022-11-23T02:34:47.1918431Z dist init r=0, world=2 2022-11-23T02:34:47.1918542Z dist init r=1, world=2 2022-11-23T02:34:47.1918625Z ok (4.812s) 2022-11-23T02:34:47.1918930Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1919388Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67601 2022-11-23T02:34:47.1919612Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67602 2022-11-23T02:34:47.1919989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1920167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1920549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1920745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1921114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1921269Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1921651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1921844Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1922097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1922347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1922752Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1923155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1923388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1923620Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1924273Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1924435Z warnings.warn( 2022-11-23T02:34:47.1925109Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1925221Z warnings.warn( 2022-11-23T02:34:47.1925335Z dist init r=0, world=2 2022-11-23T02:34:47.1925446Z dist init r=1, world=2 2022-11-23T02:34:47.1925546Z ok (4.712s) 2022-11-23T02:34:47.1925853Z test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1926358Z Tests :meth:`shard_full_optim_state_dict` for a non-FSDP-root model ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67684 2022-11-23T02:34:47.1926562Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67685 2022-11-23T02:34:47.1926942Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1927121Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1927504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1927703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1928074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1928253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1928644Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1928817Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1929072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1929322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1929729Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1930131Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1930368Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1930601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1931273Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1931390Z warnings.warn( 2022-11-23T02:34:47.1932055Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1932148Z warnings.warn( 2022-11-23T02:34:47.1932262Z dist init r=0, world=2 2022-11-23T02:34:47.1932374Z dist init r=1, world=2 2022-11-23T02:34:47.1932476Z ok (4.712s) 2022-11-23T02:34:47.1932691Z test_shard_full_optim_state_dict_transformer (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1933121Z Tests :meth:`shard_full_optim_state_dict` for an FSDP-root ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67767 2022-11-23T02:34:47.1933348Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67768 2022-11-23T02:34:47.1933784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1933948Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1934335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1934529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1934899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1935074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1935450Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1935689Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1935941Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1936175Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1936583Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1936983Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1937219Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1937451Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1937696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:34:47.1937948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:34:47.1938357Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1938757Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:34:47.1938982Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:34:47.1939224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:34:47.1939621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:34:47.1940022Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:34:47.1940143Z dist init r=1, world=2 2022-11-23T02:34:47.1940255Z dist init r=0, world=2 2022-11-23T02:34:47.1940358Z ok (5.413s) 2022-11-23T02:34:47.1940691Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_FULL_STATE_DICT_add_to_fsdp_module_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1941010Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67860 2022-11-23T02:34:47.1941214Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67861 2022-11-23T02:34:47.1941596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1941775Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1942163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1942361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1942733Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1942956Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1943345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1943517Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1943765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1944014Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1944419Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1944874Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1945108Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1945342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1946010Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1946125Z warnings.warn( 2022-11-23T02:34:47.1946776Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1946873Z warnings.warn( 2022-11-23T02:34:47.1946987Z dist init r=0, world=2 2022-11-23T02:34:47.1947098Z dist init r=1, world=2 2022-11-23T02:34:47.1947199Z ok (4.512s) 2022-11-23T02:34:47.1947526Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_FULL_STATE_DICT_add_to_fsdp_module_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1947843Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 67943 2022-11-23T02:34:47.1948067Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 67944 2022-11-23T02:34:47.1948448Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1948607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1949198Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1949406Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1949786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1949965Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1950349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1950541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1950791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1951039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1951427Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1951833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1952068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1952377Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1953053Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1953168Z warnings.warn( 2022-11-23T02:34:47.1953830Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1953998Z warnings.warn( 2022-11-23T02:34:47.1954113Z dist init r=1, world=2 2022-11-23T02:34:47.1954203Z dist init r=0, world=2 2022-11-23T02:34:47.1954307Z ok (4.512s) 2022-11-23T02:34:47.1954639Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_SHARDED_STATE_DICT_add_to_fsdp_module_False (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1954959Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68026 2022-11-23T02:34:47.1955182Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68027 2022-11-23T02:34:47.1955564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1955745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1956131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1956312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1956683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1956861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1957243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1957437Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1957690Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1957940Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1958347Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1958754Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1958968Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1959205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1959840Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1959954Z warnings.warn( 2022-11-23T02:34:47.1960581Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1960693Z warnings.warn( 2022-11-23T02:34:47.1961362Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1961524Z warnings.warn( 2022-11-23T02:34:47.1962183Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1962294Z warnings.warn( 2022-11-23T02:34:47.1962387Z dist init r=0, world=2 2022-11-23T02:34:47.1962497Z dist init r=1, world=2 2022-11-23T02:34:47.1962599Z ok (4.511s) 2022-11-23T02:34:47.1962927Z test_shard_full_optim_state_dict_unmanaged_params_state_dict_type_StateDictType_SHARDED_STATE_DICT_add_to_fsdp_module_True (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1963292Z Tests :meth:`shard_full_optim_state_dict` when there are unmanaged ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68109 2022-11-23T02:34:47.1963523Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68110 2022-11-23T02:34:47.1963903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1964083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1964449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1964644Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1965019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1965195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1965576Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1965771Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1966025Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1966277Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1966685Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1967071Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1967305Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1967538Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1968178Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1968293Z warnings.warn( 2022-11-23T02:34:47.1968926Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2455: UserWarning: torch.distributed._all_gather_base is a private function and will be deprecated. Please use torch.distributed.all_gather_into_tensor instead. 2022-11-23T02:34:47.1969037Z warnings.warn( 2022-11-23T02:34:47.1969739Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1969850Z warnings.warn( 2022-11-23T02:34:47.1970497Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:1232: UserWarning: The `optim_input` argument is deprecated and will be removed after PyTorch 1.13. You may remove it from your code without changing its functionality. 2022-11-23T02:34:47.1970592Z warnings.warn( 2022-11-23T02:34:47.1970767Z dist init r=1, world=2 2022-11-23T02:34:47.1970881Z dist init r=0, world=2 2022-11-23T02:34:47.1970984Z ok (4.511s) 2022-11-23T02:34:47.1971168Z test_use_orig_params_error (__main__.TestFSDPOptimState) 2022-11-23T02:34:47.1971492Z Tests that the optimizer state checkpointing APIs raise an error ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68192 2022-11-23T02:34:47.1971717Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68193 2022-11-23T02:34:47.1972093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1972254Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1972693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1972885Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1973255Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:34:47.1973437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:34:47.1973816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:34:47.1974006Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:34:47.1974256Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:34:47.1974487Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:34:47.1974898Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:34:47.1975301Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:34:47.1975537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:34:47.1975771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:34:47.1975885Z dist init r=0, world=2 2022-11-23T02:34:47.1975995Z dist init r=1, world=2 2022-11-23T02:34:47.1976096Z ok (4.512s) 2022-11-23T02:34:47.1976119Z 2022-11-23T02:34:47.1976393Z ---------------------------------------------------------------------- 2022-11-23T02:34:47.1976493Z Ran 53 tests in 249.011s 2022-11-23T02:34:47.1976512Z 2022-11-23T02:34:47.1976607Z OK 2022-11-23T02:34:47.1976626Z 2022-11-23T02:34:47.1976752Z Generating XML reports... 2022-11-23T02:34:47.1977218Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20221123023037.xml 2022-11-23T02:34:47.1977237Z 2022-11-23T02:34:47.1977639Z ##[endgroup] 2022-11-23T02:34:47.1978141Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_optim_state (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_optim_state_ch4des5k) 2022-11-23T02:34:47.1978161Z 2022-11-23T02:34:47.1978415Z Running distributed/test_c10d_gloo ... [2022-11-23 02:34:47.130854] 2022-11-23T02:34:47.1978892Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/test_c10d_gloo.py', '-v', '--subprocess', '--import-slow-tests', '--import-disabled-tests'] ... 
[2022-11-23 02:34:47.131172] 2022-11-23T02:48:45.6744331Z 2022-11-23T02:48:45.6744842Z Expand the folded group to see the log file of distributed/test_c10d_gloo 2022-11-23T02:48:45.6747661Z ##[group]PRINTING LOG FILE of distributed/test_c10d_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_gloo_7u35nqr_) 2022-11-23T02:48:45.6748288Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsk07g8jo 2022-11-23T02:48:45.6748852Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsk07g8jo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6750965Z , <__main__.CommTest testMethod=test_broadcast_coalesced_gloo_cuda>, <__main__.CommTest testMethod=test_gloo_barrier_device_ids>, <__main__.CommTest testMethod=test_gloo_rank_membership>, <__main__.CommTest testMethod=test_gloo_warn_not_in_group>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_default>, <__main__.CommTest testMethod=test_sequence_num_incremented_gloo_subgroup>, <__main__.CommTest testMethod=test_sequence_num_set_default_pg_gloo>, <__main__.CommTest testMethod=test_sequence_num_set_gloo_new_group>, <__main__.CommTest testMethod=test_tensor_dtype_complex>, <__main__.CommTest testMethod=test_tensor_dtype_mismatch>]> 2022-11-23T02:48:45.6752269Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) 2022-11-23T02:48:45.6752631Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) 2022-11-23T02:48:45.6752975Z test_gloo_barrier_device_ids (__main__.CommTest) 2022-11-23T02:48:45.6753299Z test_gloo_rank_membership (__main__.CommTest) 2022-11-23T02:48:45.6753632Z test_gloo_warn_not_in_group (__main__.CommTest) 2022-11-23T02:48:45.6754773Z test_sequence_num_incremented_gloo_default (__main__.CommTest) 2022-11-23T02:48:45.6755147Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) 2022-11-23T02:48:45.6755528Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) 2022-11-23T02:48:45.6755903Z 
test_sequence_num_set_gloo_new_group (__main__.CommTest) 2022-11-23T02:48:45.6756251Z test_tensor_dtype_complex (__main__.CommTest) 2022-11-23T02:48:45.6756568Z test_tensor_dtype_mismatch (__main__.CommTest) 2022-11-23T02:48:45.6757828Z , <__main__.CompilerTest testMethod=test_allgather_work_wait_gpu>, <__main__.CompilerTest testMethod=test_allreduce_work_wait_cpu>, <__main__.CompilerTest testMethod=test_allreduce_work_wait_gpu>, <__main__.CompilerTest testMethod=test_broadcast_work_wait_cpu>, <__main__.CompilerTest testMethod=test_broadcast_work_wait_gpu>, <__main__.CompilerTest testMethod=test_consecutive_comm_work_wait_cpu>, <__main__.CompilerTest testMethod=test_consecutive_comm_work_wait_gpu>, <__main__.CompilerTest testMethod=test_nested_comm_tensor_wrapping>, <__main__.CompilerTest testMethod=test_scatter_work_wait_cpu>, <__main__.CompilerTest testMethod=test_scatter_work_wait_gpu>]> 2022-11-23T02:48:45.6759090Z test_allgather_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:48:45.6759450Z test_allgather_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:48:45.6759803Z test_allreduce_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:48:45.6760135Z test_allreduce_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:48:45.6760489Z test_broadcast_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:48:45.6760837Z test_broadcast_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:48:45.6761179Z test_consecutive_comm_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:48:45.6761563Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:48:45.6761930Z test_nested_comm_tensor_wrapping (__main__.CompilerTest) 2022-11-23T02:48:45.6762288Z test_scatter_work_wait_cpu (__main__.CompilerTest) 2022-11-23T02:48:45.6762615Z test_scatter_work_wait_gpu (__main__.CompilerTest) 2022-11-23T02:48:45.6768383Z , <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_dynamic_weight_sharing>, <__main__.DistributedDataParallelTest 
testMethod=test_ddp_checkpointing_once_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_once_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_static_graph_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_twice_weight_sharing>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_unused_params_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_False>, <__main__.DistributedDataParallelTest testMethod=test_ddp_checkpointing_weight_sharing_use_reentrant_True>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_cpu>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_future_passing_gpu_gloo>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_register_just_once>, <__main__.DistributedDataParallelTest testMethod=test_ddp_comm_hook_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_init>, <__main__.DistributedDataParallelTest testMethod=test_ddp_invalid_comm_hook_return_type>, <__main__.DistributedDataParallelTest testMethod=test_find_unused_parameters_when_unused_parameters_empty>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad>, <__main__.DistributedDataParallelTest testMethod=test_global_local_unused_params_grad_with_grad_is_view>, <__main__.DistributedDataParallelTest 
testMethod=test_global_local_unused_params_grad_with_static_graph>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_integer_list>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_1gpu_module_device_ids_torch_device_list>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_2gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_4gpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module>, <__main__.DistributedDataParallelTest testMethod=test_gloo_backend_cpu_module_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output>, <__main__.DistributedDataParallelTest testMethod=test_ignored_output_with_unused_parameters>, <__main__.DistributedDataParallelTest testMethod=test_ignored_sharded_tensor>, <__main__.DistributedDataParallelTest testMethod=test_invalid_powerSGD_state>, <__main__.DistributedDataParallelTest testMethod=test_save_load_checkpoint>, <__main__.DistributedDataParallelTest testMethod=test_sparse_gradients>, <__main__.DistributedDataParallelTest testMethod=test_sparse_gradients_grad_is_view>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_empty_input>, <__main__.DistributedDataParallelTest testMethod=test_sync_batch_norm_only_empty_input>]> 2022-11-23T02:48:45.6774070Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6774595Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6775097Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6775575Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6776092Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6776628Z 
test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6777142Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6777619Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6778114Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6778621Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6779191Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6779705Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6780225Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6780997Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6781470Z test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6781919Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6782371Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6782877Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6783310Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6783806Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6784297Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6784776Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6785264Z 
test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6785766Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6786281Z test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6786739Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6787174Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6787615Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6788078Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6788497Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6789399Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6789873Z test_ignored_sharded_tensor (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6790281Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6790708Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6791127Z test_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6791561Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6791996Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6792445Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.6793553Z , <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_allreduce_coalesced>, <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_collectives>, <__main__.GlooProcessGroupWithDispatchedCollectivesTests testMethod=test_monitored_barrier>]> 2022-11-23T02:48:45.6794692Z test_allgather_coalesced 
(__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:48:45.6795236Z test_allreduce_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:48:45.6795775Z test_collectives (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:48:45.6796321Z test_monitored_barrier (__main__.GlooProcessGroupWithDispatchedCollectivesTests) 2022-11-23T02:48:45.6796756Z 2022-11-23T02:48:45.6802126Z , <__main__.ProcessGroupGlooTest testMethod=test_allgather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allgather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_cuda_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_basics_using_work_api>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_async>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_basics>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_checks_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_coalesced_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress>, <__main__.ProcessGroupGlooTest testMethod=test_allreduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_barrier_implies_wait>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics>, 
<__main__.ProcessGroupGlooTest testMethod=test_broadcast_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_checks>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress>, <__main__.ProcessGroupGlooTest testMethod=test_broadcast_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_empty_tensors>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics>, <__main__.ProcessGroupGlooTest testMethod=test_gather_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_gather_checks>, <__main__.ProcessGroupGlooTest testMethod=test_gather_noncontiguous_input>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress>, <__main__.ProcessGroupGlooTest testMethod=test_gather_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_multi_device_constructor>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_checks>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_stress>, <__main__.ProcessGroupGlooTest testMethod=test_reduce_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin>, <__main__.ProcessGroupGlooTest testMethod=test_round_robin_create_destroy>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_checks>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress>, <__main__.ProcessGroupGlooTest testMethod=test_scatter_stress_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_send_recv_all_to_all>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_basics_cuda>, <__main__.ProcessGroupGlooTest testMethod=test_sparse_allreduce_checks>]> 2022-11-23T02:48:45.6807452Z test_allgather_basics (__main__.ProcessGroupGlooTest) 
2022-11-23T02:48:45.6807845Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6808231Z test_allgather_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6808605Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6809017Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6809440Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6809826Z test_allgather_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6810212Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6810635Z test_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6811012Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6811430Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6811858Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6812255Z test_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6812626Z test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6813029Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6813431Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6813886Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6814306Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6814693Z test_allreduce_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6815079Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6815447Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6815822Z test_broadcast_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6816201Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) 
2022-11-23T02:48:45.6816557Z test_broadcast_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6816929Z test_broadcast_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6817310Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6817664Z test_empty_tensors (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6818032Z test_gather_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6818407Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6818778Z test_gather_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6819151Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6819537Z test_gather_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6819910Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6820287Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6820661Z test_reduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6821032Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6821379Z test_reduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6821737Z test_reduce_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6822113Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6822463Z test_round_robin (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6822853Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6823231Z test_scatter_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6823604Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6823962Z test_scatter_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6824329Z test_scatter_stress (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6824694Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6825060Z test_send_recv_all_to_all 
(__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6825450Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6825852Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6826242Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) 2022-11-23T02:48:45.6827130Z , <__main__.ReducerTest testMethod=test_forward_backward_optimizer>, <__main__.ReducerTest testMethod=test_forward_backward_unused_parameters>, <__main__.ReducerTest testMethod=test_multi_dtype_multi_bucket>, <__main__.ReducerTest testMethod=test_multi_dtype_single_bucket>, <__main__.ReducerTest testMethod=test_single_dtype_single_bucket>]> 2022-11-23T02:48:45.6827999Z test_forward_backward (__main__.ReducerTest) 2022-11-23T02:48:45.6828365Z test_forward_backward_optimizer (__main__.ReducerTest) 2022-11-23T02:48:45.6828749Z test_forward_backward_unused_parameters (__main__.ReducerTest) 2022-11-23T02:48:45.6829562Z test_multi_dtype_multi_bucket (__main__.ReducerTest) 2022-11-23T02:48:45.6829925Z test_multi_dtype_single_bucket (__main__.ReducerTest) 2022-11-23T02:48:45.6830282Z test_single_dtype_single_bucket (__main__.ReducerTest) 2022-11-23T02:48:45.6830698Z ]> 2022-11-23T02:48:45.6831118Z test_logging_init (__main__.RendezvousEnvTest) 2022-11-23T02:48:45.6831553Z 2022-11-23T02:48:45.6831968Z ]> 2022-11-23T02:48:45.6832399Z test_default_store_timeout_gloo (__main__.TimeoutTest) 2022-11-23T02:48:45.6833109Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6833580Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6834155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6834640Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6835124Z INFO:torch.distributed.nn.jit.instantiator:Created a 
temporary directory at /tmp/tmpifjt38y4 2022-11-23T02:48:45.6835682Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpifjt38y4/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6835983Z 2022-11-23T02:48:45.6836098Z Running tests... 2022-11-23T02:48:45.6836514Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6837057Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6837546Z test_broadcast_coalesced_gloo_cpu (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6838027Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68343 2022-11-23T02:48:45.6838490Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68344 2022-11-23T02:48:45.6861316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6861856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6862493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6862983Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6863586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6864051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6864642Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6865111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6865588Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4d_jc62b 2022-11-23T02:48:45.6866140Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp4d_jc62b/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6866663Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6867153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl577lfsa 2022-11-23T02:48:45.6867708Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl577lfsa/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6868227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6868707Z ok (4.025s) 2022-11-23T02:48:45.6868887Z 2022-11-23T02:48:45.6869671Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6870017Z Ran 1 test in 4.026s 2022-11-23T02:48:45.6870181Z 2022-11-23T02:48:45.6870280Z OK 2022-11-23T02:48:45.6870398Z 2022-11-23T02:48:45.6870524Z Generating XML reports... 2022-11-23T02:48:45.6871083Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023450.xml 2022-11-23T02:48:45.6871763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6872213Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6872923Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6873416Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6873900Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcuwvti3s 2022-11-23T02:48:45.6874433Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcuwvti3s/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6874750Z 2022-11-23T02:48:45.6874859Z Running tests... 
2022-11-23T02:48:45.6875277Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6875805Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6876300Z test_broadcast_coalesced_gloo_cuda (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6876783Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68452 2022-11-23T02:48:45.6877259Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68453 2022-11-23T02:48:45.6877862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6878335Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6878928Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6879415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6879991Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6880445Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6881031Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6881500Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6881979Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdfcm49tz 2022-11-23T02:48:45.6882538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdfcm49tz/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6883069Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6883561Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmptmntz528 2022-11-23T02:48:45.6884114Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmptmntz528/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6884630Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6884982Z ok (5.562s) 2022-11-23T02:48:45.6885117Z 2022-11-23T02:48:45.6885399Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6885741Z Ran 1 test in 5.562s 2022-11-23T02:48:45.6885909Z 2022-11-23T02:48:45.6886004Z OK 2022-11-23T02:48:45.6886147Z 2022-11-23T02:48:45.6886255Z Generating XML reports... 2022-11-23T02:48:45.6886803Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023457.xml 2022-11-23T02:48:45.6887550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6888022Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6888591Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6889070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6889544Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp099bif3i 2022-11-23T02:48:45.6890075Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp099bif3i/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6890436Z 2022-11-23T02:48:45.6890547Z Running tests... 2022-11-23T02:48:45.6890954Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6891490Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6891958Z test_gloo_barrier_device_ids (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6892427Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68563 2022-11-23T02:48:45.6892888Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68564 2022-11-23T02:48:45.6893487Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6893941Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6894520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6894999Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6895570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6896029Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6896606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6897063Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6897536Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpalgxpo0_ 2022-11-23T02:48:45.6898092Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpalgxpo0_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6898607Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6899096Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3h_pdyyh 2022-11-23T02:48:45.6899637Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3h_pdyyh/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6900154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6900647Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.6901134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.6901803Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6902506Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6902885Z ok (3.973s) 2022-11-23T02:48:45.6903034Z 2022-11-23T02:48:45.6903303Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6903634Z Ran 1 test in 3.974s 2022-11-23T02:48:45.6903796Z 2022-11-23T02:48:45.6903890Z OK 2022-11-23T02:48:45.6904006Z 2022-11-23T02:48:45.6904130Z Generating XML reports... 2022-11-23T02:48:45.6904727Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023505.xml 2022-11-23T02:48:45.6905406Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6905847Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6906429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6906905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6907379Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg5mdmo9b 2022-11-23T02:48:45.6907976Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg5mdmo9b/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6908286Z 2022-11-23T02:48:45.6908394Z Running tests... 
2022-11-23T02:48:45.6908803Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6909792Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6910274Z test_gloo_rank_membership (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6910741Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68672 2022-11-23T02:48:45.6911200Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68673 2022-11-23T02:48:45.6911799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6912259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6912844Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6913332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6913908Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6914357Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6914935Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6915394Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6915861Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqqei6_7o 2022-11-23T02:48:45.6916409Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqqei6_7o/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6916950Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp14dl8yew 2022-11-23T02:48:45.6917481Z INFO:torch.distributed.nn.jit.instantiator:Writing 
/tmp/tmp14dl8yew/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6917999Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6918481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6918956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.6919464Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.6920133Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6920830Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6921355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:45.6921861Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:45.6922606Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.6923321Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.6923701Z ok (3.973s) 2022-11-23T02:48:45.6923850Z 2022-11-23T02:48:45.6924119Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6924446Z Ran 1 test in 3.973s 2022-11-23T02:48:45.6924608Z 2022-11-23T02:48:45.6924683Z OK 2022-11-23T02:48:45.6924816Z 2022-11-23T02:48:45.6924941Z Generating XML reports... 
2022-11-23T02:48:45.6925485Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023511.xml 2022-11-23T02:48:45.6926239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6926680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6927266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6927745Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6928199Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7glolsle 2022-11-23T02:48:45.6928747Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7glolsle/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6929058Z 2022-11-23T02:48:45.6929167Z Running tests... 2022-11-23T02:48:45.6929574Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6930091Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6930571Z test_gloo_warn_not_in_group (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6931034Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68784 2022-11-23T02:48:45.6931480Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68785 2022-11-23T02:48:45.6932099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6932557Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6933138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6933598Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6934181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6934635Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6935211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6935663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6936138Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4ptkuged 2022-11-23T02:48:45.6936690Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4ptkuged/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6937213Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpga906zxa 2022-11-23T02:48:45.6937753Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpga906zxa/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6938271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6938748Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6939230Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.6939736Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.6940462Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6941177Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6941704Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:45.6942207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:45.6942869Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.6943545Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.6943999Z ok (5.592s) 2022-11-23T02:48:45.6944150Z 2022-11-23T02:48:45.6944422Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6944756Z Ran 1 test in 5.592s 2022-11-23T02:48:45.6944901Z 2022-11-23T02:48:45.6944992Z OK 2022-11-23T02:48:45.6945127Z 2022-11-23T02:48:45.6945252Z Generating XML reports... 
2022-11-23T02:48:45.6945802Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023517.xml 2022-11-23T02:48:45.6946456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6946909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6947489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6947964Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6948410Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0dtq355y 2022-11-23T02:48:45.6949322Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0dtq355y/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6949641Z 2022-11-23T02:48:45.6949750Z Running tests... 2022-11-23T02:48:45.6950160Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6950675Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6951180Z test_sequence_num_incremented_gloo_default (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6951663Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 68898 2022-11-23T02:48:45.6952105Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 68899 2022-11-23T02:48:45.6952717Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6953177Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6953768Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6954233Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6954818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6955268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6955829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6956304Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6956775Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfdyd8xpa 2022-11-23T02:48:45.6957328Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfdyd8xpa/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6957851Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4mm8mgts 2022-11-23T02:48:45.6958520Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4mm8mgts/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6959050Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6959510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6960003Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.6960504Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.6961176Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6961989Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6962531Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:45.6963042Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:45.6963704Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.6964384Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.6964780Z ok (5.632s) 2022-11-23T02:48:45.6964933Z 2022-11-23T02:48:45.6965204Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6965538Z Ran 1 test in 5.633s 2022-11-23T02:48:45.6965682Z 2022-11-23T02:48:45.6965781Z OK 2022-11-23T02:48:45.6965915Z 2022-11-23T02:48:45.6966040Z Generating XML reports... 
2022-11-23T02:48:45.6966592Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023525.xml 2022-11-23T02:48:45.6967251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6967706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6968289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6968769Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6969227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnab557dm 2022-11-23T02:48:45.6969775Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnab557dm/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6970079Z 2022-11-23T02:48:45.6970195Z Running tests... 2022-11-23T02:48:45.6970585Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6971115Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6971633Z test_sequence_num_incremented_gloo_subgroup (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6972122Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69015 2022-11-23T02:48:45.6972567Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69016 2022-11-23T02:48:45.6973175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6973633Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6974199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6974679Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6975268Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6975725Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6976348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6976832Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6977305Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv9vj7fi3 2022-11-23T02:48:45.6977837Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv9vj7fi3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6978375Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_iu47cwh 2022-11-23T02:48:45.6978913Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_iu47cwh/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6979478Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6979941Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6980336Z 
skip: Need at least 4 CUDA devices (3.946s) 2022-11-23T02:48:45.6980534Z 2022-11-23T02:48:45.6980808Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6981135Z Ran 1 test in 3.946s 2022-11-23T02:48:45.6981282Z 2022-11-23T02:48:45.6981389Z OK (skipped=1) 2022-11-23T02:48:45.6981546Z 2022-11-23T02:48:45.6981668Z Generating XML reports... 2022-11-23T02:48:45.6982214Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023533.xml 2022-11-23T02:48:45.6982867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6983319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6983905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6984385Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6984844Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0peje5m2 2022-11-23T02:48:45.6985391Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0peje5m2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6985696Z 2022-11-23T02:48:45.6985804Z Running tests... 2022-11-23T02:48:45.6986193Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6986725Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.6987222Z test_sequence_num_set_default_pg_gloo (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.6987704Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69118 2022-11-23T02:48:45.6988150Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69119 2022-11-23T02:48:45.6988760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6989624Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6990200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6990678Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6991262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.6991713Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.6992278Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.6992756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.6993227Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpilyjn1t4 2022-11-23T02:48:45.6993845Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpilyjn1t4/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6994382Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.6994892Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprv6iob5b 2022-11-23T02:48:45.6995438Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprv6iob5b/_remote_module_non_scriptable.py 2022-11-23T02:48:45.6995938Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.6996434Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.6997018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.6997690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6998376Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.6998780Z ok (4.025s) 2022-11-23T02:48:45.6998932Z 2022-11-23T02:48:45.6999202Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.6999514Z Ran 1 test in 4.026s 2022-11-23T02:48:45.6999679Z 2022-11-23T02:48:45.6999771Z OK 2022-11-23T02:48:45.6999904Z 2022-11-23T02:48:45.7000030Z Generating XML reports... 2022-11-23T02:48:45.7000582Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023540.xml 2022-11-23T02:48:45.7001238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7001699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7002282Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7002743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7003213Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4yj0afh5 2022-11-23T02:48:45.7003767Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4yj0afh5/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7004073Z 2022-11-23T02:48:45.7004182Z Running tests... 
2022-11-23T02:48:45.7004572Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7005102Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7005597Z test_sequence_num_set_gloo_new_group (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7006059Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69227 2022-11-23T02:48:45.7006519Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69228 2022-11-23T02:48:45.7007138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7007595Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7008162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7008639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7009223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7009676Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7010238Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7010713Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7011189Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps8nhhenw 2022-11-23T02:48:45.7011779Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps8nhhenw/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7012309Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7012816Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmpn96bt1mm 2022-11-23T02:48:45.7013360Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn96bt1mm/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7013855Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7014350Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7014909Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7015583Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7016275Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7016820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:48:45.7017326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:48:45.7017964Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.7018653Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:48:45.7019056Z ok (4.082s) 2022-11-23T02:48:45.7019205Z 2022-11-23T02:48:45.7019472Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7019781Z Ran 1 test in 4.082s 2022-11-23T02:48:45.7019943Z 2022-11-23T02:48:45.7020036Z OK 2022-11-23T02:48:45.7020175Z 2022-11-23T02:48:45.7020299Z Generating XML reports... 
2022-11-23T02:48:45.7020827Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023546.xml 2022-11-23T02:48:45.7021492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7021951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7022531Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7022993Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7023465Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_uqlx3kc 2022-11-23T02:48:45.7024009Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_uqlx3kc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7024312Z 2022-11-23T02:48:45.7024421Z Running tests... 2022-11-23T02:48:45.7024812Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7025346Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7025824Z test_tensor_dtype_complex (__main__.CommTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7026270Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69342 2022-11-23T02:48:45.7026725Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69343 2022-11-23T02:48:45.7027337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7027796Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7028359Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7028891Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7029852Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7030285Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7030865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7031336Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7031805Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwtc7als_ 2022-11-23T02:48:45.7032333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwtc7als_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7032958Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7033467Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe6xlvxip 2022-11-23T02:48:45.7033996Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe6xlvxip/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7034510Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7035009Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7035516Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7036168Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7036869Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7037276Z ok (3.965s) 2022-11-23T02:48:45.7037426Z 2022-11-23T02:48:45.7037696Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7038004Z Ran 1 test in 3.965s 2022-11-23T02:48:45.7038169Z 2022-11-23T02:48:45.7038263Z OK 2022-11-23T02:48:45.7038399Z 2022-11-23T02:48:45.7038524Z Generating XML reports... 2022-11-23T02:48:45.7039051Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023552.xml 2022-11-23T02:48:45.7039721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7040180Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7040764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7041225Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7041706Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvpwsuu2_ 2022-11-23T02:48:45.7042258Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvpwsuu2_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7042567Z 2022-11-23T02:48:45.7042658Z Running tests... 
2022-11-23T02:48:45.7043068Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7043602Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7044082Z test_tensor_dtype_mismatch (__main__.CommTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7044527Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69451 2022-11-23T02:48:45.7044986Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69452 2022-11-23T02:48:45.7045600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7046104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7046697Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7047248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7047848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7048278Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7048850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7049319Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7049769Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2n0_plj4 2022-11-23T02:48:45.7050377Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2n0_plj4/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7050899Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7051411Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at 
/tmp/tmpcfkc3fqf 2022-11-23T02:48:45.7051938Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcfkc3fqf/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7052454Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7052946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7053448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7054104Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7054809Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7055885Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7056527Z warnings.warn( 2022-11-23T02:48:45.7057393Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7058068Z warnings.warn( 2022-11-23T02:48:45.7058947Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7059576Z warnings.warn( 2022-11-23T02:48:45.7060440Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7061064Z warnings.warn( 2022-11-23T02:48:45.7061304Z ok (4.035s) 2022-11-23T02:48:45.7061455Z 2022-11-23T02:48:45.7061724Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7062036Z Ran 1 test in 4.035s 2022-11-23T02:48:45.7062198Z 2022-11-23T02:48:45.7062292Z OK 2022-11-23T02:48:45.7062426Z 2022-11-23T02:48:45.7062549Z Generating XML reports... 2022-11-23T02:48:45.7063077Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023559.xml 2022-11-23T02:48:45.7063757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7064219Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7064858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7065332Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7065805Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxlly9t2v 2022-11-23T02:48:45.7066353Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxlly9t2v/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7066661Z 2022-11-23T02:48:45.7066769Z Running tests... 
2022-11-23T02:48:45.7067159Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7067691Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7068239Z test_allgather_work_wait_cpu (__main__.CompilerTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7068696Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69560 2022-11-23T02:48:45.7069559Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69561 2022-11-23T02:48:45.7070186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7070644Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7071209Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7071687Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7072272Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7072711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7073287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7073760Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7074229Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaqdggvje 2022-11-23T02:48:45.7074759Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaqdggvje/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7075280Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7075787Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmp23qe2ati 2022-11-23T02:48:45.7076308Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp23qe2ati/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7076826Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7077323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7077832Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7078489Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7079191Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7080136Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7080872Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7081727Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7082540Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7082890Z ok (4.061s) 2022-11-23T02:48:45.7083041Z 2022-11-23T02:48:45.7083312Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7083630Z Ran 1 test in 4.061s 2022-11-23T02:48:45.7083792Z 2022-11-23T02:48:45.7083886Z OK 2022-11-23T02:48:45.7084020Z 2022-11-23T02:48:45.7084145Z Generating XML reports... 
2022-11-23T02:48:45.7084695Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023605.xml 2022-11-23T02:48:45.7085379Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7085913Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7086498Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7086960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7087432Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3rtpibxe 2022-11-23T02:48:45.7087983Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3rtpibxe/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7088291Z 2022-11-23T02:48:45.7088400Z Running tests... 2022-11-23T02:48:45.7088790Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7089323Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7089814Z test_allgather_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7090274Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69669 2022-11-23T02:48:45.7090730Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69670 2022-11-23T02:48:45.7091353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7091812Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7092376Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7092857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7093444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7093874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7094457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7094932Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7095405Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4qxcddzm 2022-11-23T02:48:45.7095935Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4qxcddzm/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7096457Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7096965Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi7r8bw2k 2022-11-23T02:48:45.7097493Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi7r8bw2k/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7098004Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7098500Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7099009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7099717Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7100434Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7101374Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7102103Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7102953Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7103735Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7104067Z ok (5.554s) 2022-11-23T02:48:45.7104222Z 2022-11-23T02:48:45.7104491Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7104805Z Ran 1 test in 5.554s 2022-11-23T02:48:45.7104966Z 2022-11-23T02:48:45.7105059Z OK 2022-11-23T02:48:45.7105196Z 2022-11-23T02:48:45.7105319Z Generating XML reports... 
2022-11-23T02:48:45.7105865Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023611.xml 2022-11-23T02:48:45.7106547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7107004Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7107585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7108046Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7108521Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqrdipugc 2022-11-23T02:48:45.7109328Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqrdipugc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7109649Z 2022-11-23T02:48:45.7109758Z Running tests... 2022-11-23T02:48:45.7110158Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7110694Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7111188Z test_allreduce_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7111650Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69780 2022-11-23T02:48:45.7112115Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69781 2022-11-23T02:48:45.7112737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7113202Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7113773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7114251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7114834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7115264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7115847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7116317Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7116795Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpslegwz2_ 2022-11-23T02:48:45.7117330Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpslegwz2_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7117952Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphl092_g8 2022-11-23T02:48:45.7118512Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphl092_g8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7119027Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7119488Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7119981Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7120492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7121225Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7121927Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7122866Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7123594Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7124461Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7126809Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7127687Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7128409Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7129266Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:48:45.7129970Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7130296Z ok (4.082s) 2022-11-23T02:48:45.7130447Z 2022-11-23T02:48:45.7130717Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7131053Z Ran 1 test in 4.082s 2022-11-23T02:48:45.7131200Z 2022-11-23T02:48:45.7131292Z OK 2022-11-23T02:48:45.7131426Z 2022-11-23T02:48:45.7131550Z Generating XML reports... 2022-11-23T02:48:45.7132114Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023619.xml 2022-11-23T02:48:45.7132779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7133236Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7133817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7134294Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7134750Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsp2adcde 2022-11-23T02:48:45.7135305Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsp2adcde/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7135611Z 2022-11-23T02:48:45.7135720Z Running tests... 2022-11-23T02:48:45.7136111Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7136707Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7137208Z test_allreduce_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7137685Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 69889 2022-11-23T02:48:45.7138126Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 69890 2022-11-23T02:48:45.7138740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7139200Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7139784Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7140303Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7140897Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7141350Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7141916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7142391Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7142864Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo2z3kaz9 2022-11-23T02:48:45.7143417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo2z3kaz9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7143939Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgs48__fg 2022-11-23T02:48:45.7144484Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgs48__fg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7144997Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7145461Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7145957Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7146467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7147141Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7147829Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7148765Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7149774Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7150645Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7151372Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7152220Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7152952Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7153894Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:48:45.7154637Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7154956Z ok (5.556s) 2022-11-23T02:48:45.7155105Z 2022-11-23T02:48:45.7155372Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7155703Z Ran 1 test in 5.556s 2022-11-23T02:48:45.7155864Z 2022-11-23T02:48:45.7155938Z OK 2022-11-23T02:48:45.7156072Z 2022-11-23T02:48:45.7156196Z Generating XML reports... 2022-11-23T02:48:45.7156759Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023626.xml 2022-11-23T02:48:45.7157523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7157960Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7158589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7159064Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7159521Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpics2ecsg 2022-11-23T02:48:45.7160069Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpics2ecsg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7160379Z 2022-11-23T02:48:45.7160488Z Running tests... 2022-11-23T02:48:45.7160894Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7161418Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7161912Z test_broadcast_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7162391Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70000 2022-11-23T02:48:45.7162837Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70001 2022-11-23T02:48:45.7163455Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7163910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7164494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7164954Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7165534Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7165983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7166570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7167032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7167505Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp40128opx 2022-11-23T02:48:45.7168048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp40128opx/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7168546Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7169051Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp982_eb9k 2022-11-23T02:48:45.7169588Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp982_eb9k/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7170096Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7170574Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7171083Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7171807Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7172525Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7173445Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7174179Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7175043Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7175830Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7176147Z ok (4.072s) 2022-11-23T02:48:45.7176296Z 2022-11-23T02:48:45.7176566Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7176895Z Ran 1 test in 4.072s 2022-11-23T02:48:45.7177056Z 2022-11-23T02:48:45.7177130Z OK 2022-11-23T02:48:45.7177266Z 2022-11-23T02:48:45.7177392Z Generating XML reports... 
2022-11-23T02:48:45.7177956Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023634.xml 2022-11-23T02:48:45.7178637Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7179079Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7179659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7180134Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7180608Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp17tnzdlw 2022-11-23T02:48:45.7181140Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp17tnzdlw/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7181447Z 2022-11-23T02:48:45.7181557Z Running tests... 2022-11-23T02:48:45.7181962Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7182480Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7182970Z test_broadcast_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7183446Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70109 2022-11-23T02:48:45.7183909Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70110 2022-11-23T02:48:45.7184513Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7184971Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7185553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7186016Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7186600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7187052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7187631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7188092Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7188565Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn8q4kvpj 2022-11-23T02:48:45.7189461Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn8q4kvpj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7190011Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn7ks7qjs 2022-11-23T02:48:45.7190562Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn7ks7qjs/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7191082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7191566Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7192040Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7192793Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7193339Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7194003Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7194930Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7195669Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7196532Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7197264Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7197580Z ok (5.471s) 2022-11-23T02:48:45.7197733Z 2022-11-23T02:48:45.7198006Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7198345Z Ran 1 test in 5.472s 2022-11-23T02:48:45.7198508Z 2022-11-23T02:48:45.7198601Z OK 2022-11-23T02:48:45.7198719Z 2022-11-23T02:48:45.7198843Z Generating XML reports... 
2022-11-23T02:48:45.7199409Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023640.xml 2022-11-23T02:48:45.7200090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7200531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7201114Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7201595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7202064Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu3cuyzwt 2022-11-23T02:48:45.7202597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu3cuyzwt/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7202906Z 2022-11-23T02:48:45.7203017Z Running tests... 2022-11-23T02:48:45.7203422Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7203938Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7204440Z test_consecutive_comm_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7204923Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70220 2022-11-23T02:48:45.7205382Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70221 2022-11-23T02:48:45.7205983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7206442Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7207080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7207549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7208134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7208583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7209161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7209617Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7210151Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfvl_o4_i 2022-11-23T02:48:45.7210699Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfvl_o4_i/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7211220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7211711Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi0oa4728 2022-11-23T02:48:45.7212259Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi0oa4728/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7212771Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7213249Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7213752Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7214428Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7215130Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7216055Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7216790Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7217654Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7218369Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7219237Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7219952Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7220821Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:48:45.7221543Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7222399Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7223111Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7224020Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7224747Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7225606Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7226324Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7227222Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7227949Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7228281Z ok (4.046s) 2022-11-23T02:48:45.7228432Z 2022-11-23T02:48:45.7228702Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7229245Z Ran 1 test in 4.046s 2022-11-23T02:48:45.7229455Z 2022-11-23T02:48:45.7229548Z OK 2022-11-23T02:48:45.7229685Z 
2022-11-23T02:48:45.7229811Z Generating XML reports... 2022-11-23T02:48:45.7230366Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023648.xml 2022-11-23T02:48:45.7231048Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7231508Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7232079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7232560Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7233035Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbhjrsdki 2022-11-23T02:48:45.7233592Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbhjrsdki/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7233901Z 2022-11-23T02:48:45.7233991Z Running tests... 2022-11-23T02:48:45.7234396Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7234927Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7235429Z test_consecutive_comm_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7235901Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70329 2022-11-23T02:48:45.7236358Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70330 2022-11-23T02:48:45.7236974Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7237414Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7237998Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7238474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7239056Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7239485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7240061Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7240537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7240990Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyq842743 2022-11-23T02:48:45.7241615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyq842743/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7242143Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7242648Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7wqw19w9 2022-11-23T02:48:45.7243170Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7wqw19w9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7243682Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7244177Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7244770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7245428Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7246131Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7247071Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7247807Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7248652Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7249386Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7250251Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7250978Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7251823Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:48:45.7252539Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7253405Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7254133Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7254991Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant2 target _tensor_constant2 _tensor_constant2 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7255699Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7256553Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7257267Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7258233Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant3 target _tensor_constant3 _tensor_constant3 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7258953Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7259281Z ok (5.461s) 2022-11-23T02:48:45.7259433Z 2022-11-23T02:48:45.7259705Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7260034Z Ran 1 test in 5.461s 2022-11-23T02:48:45.7260178Z 2022-11-23T02:48:45.7260270Z OK 2022-11-23T02:48:45.7260404Z 
2022-11-23T02:48:45.7260528Z Generating XML reports... 2022-11-23T02:48:45.7261090Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023654.xml 2022-11-23T02:48:45.7261817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7262273Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7262860Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7263339Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7263796Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppq9l97yd 2022-11-23T02:48:45.7264346Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppq9l97yd/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7264654Z 2022-11-23T02:48:45.7264763Z Running tests... 2022-11-23T02:48:45.7265155Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7265687Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7266187Z test_nested_comm_tensor_wrapping (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7266671Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70440 2022-11-23T02:48:45.7267115Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70441 2022-11-23T02:48:45.7267727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7268185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7268746Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7269483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7270079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7270537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7271100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7271577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7272050Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd_2g1lhr 2022-11-23T02:48:45.7272597Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd_2g1lhr/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7273097Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7273606Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdcmfi3n0 2022-11-23T02:48:45.7274155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdcmfi3n0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7274651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7275149Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7275656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7276410Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7277113Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7278050Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7278788Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7279724Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7280456Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7281305Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7282030Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7282889Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant1 target _tensor_constant1 _tensor_constant1 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 
2022-11-23T02:48:45.7283619Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7283932Z ok (4.079s) 2022-11-23T02:48:45.7284083Z 2022-11-23T02:48:45.7284354Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7284693Z Ran 1 test in 4.079s 2022-11-23T02:48:45.7284856Z 2022-11-23T02:48:45.7284930Z OK 2022-11-23T02:48:45.7285065Z 2022-11-23T02:48:45.7285189Z Generating XML reports... 2022-11-23T02:48:45.7285754Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023702.xml 2022-11-23T02:48:45.7286439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7286880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7287463Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7287945Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7288403Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg_cynlri 2022-11-23T02:48:45.7288953Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg_cynlri/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7289259Z 2022-11-23T02:48:45.7289368Z Running tests... 2022-11-23T02:48:45.7289772Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7290295Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7290780Z test_scatter_work_wait_cpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7291252Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70549 2022-11-23T02:48:45.7291694Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70550 2022-11-23T02:48:45.7292316Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7292772Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7293398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7293844Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7294429Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7294905Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7295504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7295965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7296438Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnxmdd3rk 2022-11-23T02:48:45.7297048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnxmdd3rk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7297575Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpchg0cdkc 2022-11-23T02:48:45.7298122Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpchg0cdkc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7298645Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7299127Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7299601Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7300105Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7300775Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7301481Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7302408Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7303139Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7304004Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7304725Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7305043Z ok (3.925s) 2022-11-23T02:48:45.7305191Z 2022-11-23T02:48:45.7305460Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7305788Z Ran 1 test in 3.925s 2022-11-23T02:48:45.7305952Z 2022-11-23T02:48:45.7306025Z OK 2022-11-23T02:48:45.7306159Z 2022-11-23T02:48:45.7306286Z Generating XML reports... 
2022-11-23T02:48:45.7306849Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023709.xml 2022-11-23T02:48:45.7307533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7307970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7308548Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7309254Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7309745Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpir1_eh8k 2022-11-23T02:48:45.7310278Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpir1_eh8k/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7310583Z 2022-11-23T02:48:45.7310693Z Running tests... 2022-11-23T02:48:45.7311206Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7311746Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7312235Z test_scatter_work_wait_gpu (__main__.CompilerTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7312699Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70658 2022-11-23T02:48:45.7313157Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70659 2022-11-23T02:48:45.7313757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7314286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7314872Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7315331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7315919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7316373Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7316949Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7317400Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7317866Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkz_iosp_ 2022-11-23T02:48:45.7318421Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkz_iosp_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7318930Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7319439Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqixez0gz 2022-11-23T02:48:45.7319988Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqixez0gz/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7320502Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7320980Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7321483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7322149Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7322854Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7323784Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7324522Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7325385Z /opt/conda/lib/python3.10/site-packages/torch/fx/graph.py:1346: UserWarning: Node _tensor_constant0 target _tensor_constant0 _tensor_constant0 of does not reference an nn.Module, nn.Parameter, or buffer, which is what 'get_attr' Nodes typically target 2022-11-23T02:48:45.7326111Z warnings.warn(f'Node {node} target {node.target} {atom} of {seen_qualname} does ' 2022-11-23T02:48:45.7326425Z ok (5.535s) 2022-11-23T02:48:45.7326577Z 2022-11-23T02:48:45.7326844Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7327163Z Ran 1 test in 5.536s 2022-11-23T02:48:45.7327324Z 2022-11-23T02:48:45.7327410Z OK 2022-11-23T02:48:45.7327528Z 2022-11-23T02:48:45.7327645Z Generating XML reports... 
2022-11-23T02:48:45.7328254Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023715.xml 2022-11-23T02:48:45.7328939Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7329376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7329955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7330430Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7330901Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpacxw6ng2 2022-11-23T02:48:45.7331160Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpacxw6ng2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7331235Z 2022-11-23T02:48:45.7331347Z Running tests... 2022-11-23T02:48:45.7331621Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7331947Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7332178Z test_ddp_checkpointing_dynamic_module (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7332551Z Dynamic module can be checkpointed, multiple times, with non-reentrant ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7332774Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70769 2022-11-23T02:48:45.7332978Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70770 2022-11-23T02:48:45.7333358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7333536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7333927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7334123Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7334499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7334677Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7335053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7335226Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7335490Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb9sokhti 2022-11-23T02:48:45.7335765Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb9sokhti/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7336006Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7336263Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplzej5540 2022-11-23T02:48:45.7336533Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplzej5540/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7336767Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7336873Z ok 
(6.179s) 2022-11-23T02:48:45.7336893Z 2022-11-23T02:48:45.7337163Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7337257Z Ran 1 test in 6.179s 2022-11-23T02:48:45.7337277Z 2022-11-23T02:48:45.7337369Z OK 2022-11-23T02:48:45.7337388Z 2022-11-23T02:48:45.7337513Z Generating XML reports... 2022-11-23T02:48:45.7337981Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023723.xml 2022-11-23T02:48:45.7338365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7338543Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7338983Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7339185Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7339424Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5wmt735r 2022-11-23T02:48:45.7339695Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5wmt735r/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7339715Z 2022-11-23T02:48:45.7339823Z Running tests... 2022-11-23T02:48:45.7340092Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7340407Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7340703Z test_ddp_checkpointing_dynamic_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7340976Z Dynamic module can be checkpointed multiple times with weight sharing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7341202Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70884 2022-11-23T02:48:45.7341406Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 70885 2022-11-23T02:48:45.7341786Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7341964Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7342351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7342545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7342916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7343098Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7343480Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7343674Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7343916Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu8u3889p 2022-11-23T02:48:45.7344189Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu8u3889p/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7344446Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp86qw4puj 2022-11-23T02:48:45.7344716Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp86qw4puj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7344949Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7345182Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7345285Z ok 
(6.026s) 2022-11-23T02:48:45.7345305Z 2022-11-23T02:48:45.7345575Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7345673Z Ran 1 test in 6.026s 2022-11-23T02:48:45.7345711Z 2022-11-23T02:48:45.7345838Z OK 2022-11-23T02:48:45.7345858Z 2022-11-23T02:48:45.7345986Z Generating XML reports... 2022-11-23T02:48:45.7346455Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023731.xml 2022-11-23T02:48:45.7346832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7347012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7347397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7347597Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7347858Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8spjzi3q 2022-11-23T02:48:45.7348170Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8spjzi3q/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7348211Z 2022-11-23T02:48:45.7348305Z Running tests... 2022-11-23T02:48:45.7348574Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7348890Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7349369Z test_ddp_checkpointing_once_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7349633Z DDP works as expected when layer is checkpointed only once. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7349858Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 70999 2022-11-23T02:48:45.7350179Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71000 2022-11-23T02:48:45.7350547Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7350730Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7351118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7351310Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7351680Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7351857Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7352239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7352435Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7352694Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2fl8von6 2022-11-23T02:48:45.7352951Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2fl8von6/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7353210Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpteay1hfr 2022-11-23T02:48:45.7353482Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpteay1hfr/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7353713Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7353945Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7354187Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7354426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7354663Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7354881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7355815Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7355928Z warnings.warn( 2022-11-23T02:48:45.7356853Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7356968Z warnings.warn( 2022-11-23T02:48:45.7357209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7357501Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7357745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7357981Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7358263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:45.7358496Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7358704Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7358986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7359089Z ok (6.170s) 2022-11-23T02:48:45.7359110Z 2022-11-23T02:48:45.7359384Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7359497Z Ran 1 test in 6.170s 2022-11-23T02:48:45.7359520Z 2022-11-23T02:48:45.7359613Z OK 2022-11-23T02:48:45.7359632Z 2022-11-23T02:48:45.7359758Z Generating XML reports... 2022-11-23T02:48:45.7360227Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023740.xml 2022-11-23T02:48:45.7360586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7360766Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7361156Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7361355Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7361619Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqer4m5ps 2022-11-23T02:48:45.7361898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqer4m5ps/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7361921Z 2022-11-23T02:48:45.7362030Z Running tests... 
2022-11-23T02:48:45.7362294Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7362609Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7362834Z test_ddp_checkpointing_once_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7363087Z DDP works as expected when layer is checkpointed only once. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7363309Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71114 2022-11-23T02:48:45.7363531Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71115 2022-11-23T02:48:45.7363913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7364090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7364483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7364680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7365032Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7365207Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7365587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7365778Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7366045Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpumaquof5 2022-11-23T02:48:45.7366324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpumaquof5/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7366633Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmpp8k9o6hq 2022-11-23T02:48:45.7366916Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp8k9o6hq/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7367150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7367359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7367600Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7367839Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7368124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7368362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7369289Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7369403Z warnings.warn( 2022-11-23T02:48:45.7370325Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7370440Z warnings.warn( 2022-11-23T02:48:45.7370678Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:48:45.7370899Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7371139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7371374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7371604Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7371831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7372065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7372297Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7372403Z ok (6.159s) 2022-11-23T02:48:45.7372423Z 2022-11-23T02:48:45.7372675Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7372785Z Ran 1 test in 6.160s 2022-11-23T02:48:45.7372805Z 2022-11-23T02:48:45.7372897Z OK 2022-11-23T02:48:45.7372917Z 2022-11-23T02:48:45.7373045Z Generating XML reports... 
2022-11-23T02:48:45.7373516Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023748.xml 2022-11-23T02:48:45.7373893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7374073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7374461Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7374656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7374902Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp52qivqbf 2022-11-23T02:48:45.7375176Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp52qivqbf/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7375196Z 2022-11-23T02:48:45.7375351Z Running tests... 2022-11-23T02:48:45.7375634Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7375949Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7376219Z test_ddp_checkpointing_twice_static_graph_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7376571Z Regardless of reentrant or non-reentrant checkpointing impl, ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7376794Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71229 2022-11-23T02:48:45.7376998Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71230 2022-11-23T02:48:45.7377427Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7377607Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7377997Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7378191Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7378559Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7378735Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7379118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7379311Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7379553Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgqe9qyff 2022-11-23T02:48:45.7379833Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgqe9qyff/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7380092Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqw319w3t 2022-11-23T02:48:45.7380363Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqw319w3t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7380598Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7380830Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7381071Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7381309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7381527Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7381762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7381865Z ok (6.043s) 2022-11-23T02:48:45.7381884Z 2022-11-23T02:48:45.7382156Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7382274Z Ran 1 test in 6.044s 2022-11-23T02:48:45.7382293Z 2022-11-23T02:48:45.7382386Z OK 2022-11-23T02:48:45.7382405Z 2022-11-23T02:48:45.7382531Z Generating XML reports... 2022-11-23T02:48:45.7382999Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023757.xml 2022-11-23T02:48:45.7383378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7383539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7383927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7384127Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7384385Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpedt034gy 2022-11-23T02:48:45.7384705Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpedt034gy/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7384727Z 2022-11-23T02:48:45.7384841Z Running tests... 
2022-11-23T02:48:45.7385112Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7385426Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7385674Z test_ddp_checkpointing_twice_static_graph_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7386024Z Regardless of reentrant or non-reentrant checkpointing impl, ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7386246Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71344 2022-11-23T02:48:45.7386516Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71345 2022-11-23T02:48:45.7386892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7387074Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7387462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7387656Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7388028Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7388185Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7388565Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7388759Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7389253Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr83qj4s1 2022-11-23T02:48:45.7389540Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr83qj4s1/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7389773Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7390031Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv00j7oez 2022-11-23T02:48:45.7390299Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv00j7oez/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7390512Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7390753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7390990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7391227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7391461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7391565Z ok (6.014s) 2022-11-23T02:48:45.7391589Z 2022-11-23T02:48:45.7391870Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7391986Z Ran 1 test in 6.014s 2022-11-23T02:48:45.7392005Z 2022-11-23T02:48:45.7392078Z OK 2022-11-23T02:48:45.7392117Z 2022-11-23T02:48:45.7392223Z Generating XML reports... 
2022-11-23T02:48:45.7392690Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023805.xml 2022-11-23T02:48:45.7393067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7393247Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7393638Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7393834Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7394170Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy0k4cvda 2022-11-23T02:48:45.7394460Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy0k4cvda/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7394481Z 2022-11-23T02:48:45.7394571Z Running tests... 2022-11-23T02:48:45.7394842Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7395158Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7395405Z test_ddp_checkpointing_twice_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7395785Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7396078Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71459 2022-11-23T02:48:45.7396299Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71460 2022-11-23T02:48:45.7396683Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7396862Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7397232Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7397427Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7397796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7397972Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7398358Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7398550Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7398817Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphoko97fd 2022-11-23T02:48:45.7399095Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphoko97fd/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7399334Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiv4eilmj 2022-11-23T02:48:45.7399608Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiv4eilmj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7399841Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7400071Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7400316Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7400562Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7401368Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:48:45.7402166Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:48:45.7402461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7402711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7402815Z ok (6.164s) 2022-11-23T02:48:45.7402835Z 2022-11-23T02:48:45.7403109Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7403203Z Ran 1 test in 6.165s 2022-11-23T02:48:45.7403223Z 2022-11-23T02:48:45.7403316Z OK 2022-11-23T02:48:45.7403336Z 2022-11-23T02:48:45.7403463Z Generating XML reports... 
2022-11-23T02:48:45.7403931Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023813.xml 2022-11-23T02:48:45.7404367Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7404547Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7404937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7405133Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7405393Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwm0b45o8 2022-11-23T02:48:45.7405650Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwm0b45o8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7405670Z 2022-11-23T02:48:45.7405780Z Running tests... 2022-11-23T02:48:45.7406049Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7406364Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7406613Z test_ddp_checkpointing_twice_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7406994Z Checkpoitning twice fails for non-static graph with reentrant checkpoint ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7407221Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71574 2022-11-23T02:48:45.7407440Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71575 2022-11-23T02:48:45.7407801Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7407980Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7408366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7408558Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7408933Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7409108Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7409491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7409684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7409942Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5r7f58sd 2022-11-23T02:48:45.7410198Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5r7f58sd/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7410429Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7410691Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo10m0uht 2022-11-23T02:48:45.7410962Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo10m0uht/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7411197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7411299Z ok 
(6.053s) 2022-11-23T02:48:45.7411319Z 2022-11-23T02:48:45.7411639Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7411760Z Ran 1 test in 6.054s 2022-11-23T02:48:45.7411780Z 2022-11-23T02:48:45.7411853Z OK 2022-11-23T02:48:45.7411872Z 2022-11-23T02:48:45.7411996Z Generating XML reports... 2022-11-23T02:48:45.7412465Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023822.xml 2022-11-23T02:48:45.7412840Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7413016Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7413399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7413660Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7413919Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg064l0bh 2022-11-23T02:48:45.7414194Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg064l0bh/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7414214Z 2022-11-23T02:48:45.7414305Z Running tests... 2022-11-23T02:48:45.7414573Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7414889Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7415130Z test_ddp_checkpointing_twice_weight_sharing (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7415405Z Checkpointing should work with static graph in the case of checkpointing ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7415629Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71689 2022-11-23T02:48:45.7415852Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71690 2022-11-23T02:48:45.7416231Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7416393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7416785Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7416979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7417349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7417522Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7417901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7418095Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7418357Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0odrxoje 2022-11-23T02:48:45.7418633Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0odrxoje/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7418846Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7419107Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7vsz0ei1 2022-11-23T02:48:45.7419431Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7vsz0ei1/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7419663Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7419902Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7420140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7420382Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7420620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7420752Z ok (6.047s) 2022-11-23T02:48:45.7420774Z 2022-11-23T02:48:45.7421055Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7421169Z Ran 1 test in 6.047s 2022-11-23T02:48:45.7421188Z 2022-11-23T02:48:45.7421279Z OK 2022-11-23T02:48:45.7421298Z 2022-11-23T02:48:45.7421422Z Generating XML reports... 2022-11-23T02:48:45.7421885Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023830.xml 2022-11-23T02:48:45.7422259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7422436Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7422873Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7423049Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7423311Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnii01e6f 2022-11-23T02:48:45.7423585Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnii01e6f/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7423604Z 2022-11-23T02:48:45.7423712Z Running tests... 
2022-11-23T02:48:45.7423981Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7424294Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7424558Z test_ddp_checkpointing_unused_params_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7424833Z With reentrant autograd checkpointing impl, DDP will fail when there are ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7425040Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71804 2022-11-23T02:48:45.7425261Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71805 2022-11-23T02:48:45.7425641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7425820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7426206Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7426401Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7426772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7426951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7427337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7427511Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7427776Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1r_8tf3a 2022-11-23T02:48:45.7428048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1r_8tf3a/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7428281Z 
INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7428537Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdczru3ys 2022-11-23T02:48:45.7428805Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdczru3ys/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7429266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7430159Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:48:45.7430966Z [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:48:45.7431972Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 
2022-11-23T02:48:45.7432088Z warnings.warn( 2022-11-23T02:48:45.7433019Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7433132Z warnings.warn( 2022-11-23T02:48:45.7433377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7433598Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7433837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7434077Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7434179Z ok (6.184s) 2022-11-23T02:48:45.7434200Z 2022-11-23T02:48:45.7434472Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7434585Z Ran 1 test in 6.184s 2022-11-23T02:48:45.7434604Z 2022-11-23T02:48:45.7434697Z OK 2022-11-23T02:48:45.7434716Z 2022-11-23T02:48:45.7434842Z Generating XML reports... 
2022-11-23T02:48:45.7435288Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023839.xml 2022-11-23T02:48:45.7435668Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7435852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7436244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7436441Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7436701Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdake1k39 2022-11-23T02:48:45.7436974Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdake1k39/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7436993Z 2022-11-23T02:48:45.7437102Z Running tests... 2022-11-23T02:48:45.7437351Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7437667Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7437926Z test_ddp_checkpointing_unused_params_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7438205Z With reentrant autograd checkpointing impl, DDP will fail when there are ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7438430Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 71919 2022-11-23T02:48:45.7438702Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 71920 2022-11-23T02:48:45.7439093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7439272Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7439658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7439833Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7440202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7440430Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7440816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7441012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7441277Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxbu2vduh 2022-11-23T02:48:45.7441554Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxbu2vduh/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7441811Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqgr81sg2 2022-11-23T02:48:45.7442086Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqgr81sg2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7442302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7442537Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7443473Z 
/opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7443588Z warnings.warn( 2022-11-23T02:48:45.7444514Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7444630Z warnings.warn( 2022-11-23T02:48:45.7444871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7445103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7445345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7445582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7445665Z ok (6.058s) 2022-11-23T02:48:45.7445685Z 2022-11-23T02:48:45.7445955Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7446067Z Ran 1 test in 6.058s 2022-11-23T02:48:45.7446086Z 2022-11-23T02:48:45.7446178Z OK 2022-11-23T02:48:45.7446197Z 2022-11-23T02:48:45.7446320Z Generating XML reports... 
2022-11-23T02:48:45.7446788Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023847.xml 2022-11-23T02:48:45.7447175Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7447354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7447775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7447979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7448241Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprldh1xi7 2022-11-23T02:48:45.7448516Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprldh1xi7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7448537Z 2022-11-23T02:48:45.7448646Z Running tests... 2022-11-23T02:48:45.7448919Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7449237Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7449555Z test_ddp_checkpointing_weight_sharing_use_reentrant_False (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7449795Z Test that checkpointing with weight sharing works. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7450003Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72034 2022-11-23T02:48:45.7450225Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72035 2022-11-23T02:48:45.7450606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7450786Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7451174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7451368Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7451736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7451912Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7452276Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7452472Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7452733Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphhi36386 2022-11-23T02:48:45.7453006Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphhi36386/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7453239Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7453496Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpagygejw9 2022-11-23T02:48:45.7453771Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpagygejw9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7454009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7454250Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7454477Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7454711Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7454942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7455044Z ok (6.162s) 2022-11-23T02:48:45.7455063Z 2022-11-23T02:48:45.7455333Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7455447Z Ran 1 test in 6.162s 2022-11-23T02:48:45.7455466Z 2022-11-23T02:48:45.7455559Z OK 2022-11-23T02:48:45.7455578Z 2022-11-23T02:48:45.7455704Z Generating XML reports... 2022-11-23T02:48:45.7456156Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023856.xml 2022-11-23T02:48:45.7456536Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7456762Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7457158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7457352Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7457610Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd_045avk 2022-11-23T02:48:45.7457880Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd_045avk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7457900Z 2022-11-23T02:48:45.7458011Z Running tests... 
2022-11-23T02:48:45.7458322Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7458680Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7458944Z test_ddp_checkpointing_weight_sharing_use_reentrant_True (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7459188Z Test that checkpointing with weight sharing works. ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7459410Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72149 2022-11-23T02:48:45.7459630Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72150 2022-11-23T02:48:45.7460007Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7460186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7460577Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7460751Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7461125Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7461301Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7461684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7461876Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7462138Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw4t71bet 2022-11-23T02:48:45.7462408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw4t71bet/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7462665Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary 
directory at /tmp/tmprilb6d34 2022-11-23T02:48:45.7462934Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprilb6d34/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7463152Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7463384Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7463625Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7463863Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7464100Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7464338Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7464571Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7464799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7465012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7465251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7465354Z ok (6.049s) 2022-11-23T02:48:45.7465373Z 2022-11-23T02:48:45.7465690Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7465811Z Ran 1 test in 6.049s 2022-11-23T02:48:45.7465830Z 2022-11-23T02:48:45.7465923Z OK 2022-11-23T02:48:45.7465942Z 2022-11-23T02:48:45.7466066Z Generating XML reports... 
2022-11-23T02:48:45.7466538Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023904.xml 2022-11-23T02:48:45.7466898Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7467078Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7467465Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7467711Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7467972Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgdymz48x 2022-11-23T02:48:45.7468247Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgdymz48x/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7468266Z 2022-11-23T02:48:45.7468375Z Running tests... 2022-11-23T02:48:45.7468645Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7469192Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7469419Z test_ddp_comm_hook_future_passing_cpu (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7469694Z This unit test verifies whether the Future object is passed properly. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7469918Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72264 2022-11-23T02:48:45.7470143Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72265 2022-11-23T02:48:45.7470527Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7470709Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7471098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7471292Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7471645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7471820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7472207Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7472404Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7472663Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf_vq8m4s 2022-11-23T02:48:45.7472935Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf_vq8m4s/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7473192Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbdjqadgv 2022-11-23T02:48:45.7473467Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbdjqadgv/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7473699Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7473913Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7474018Z ok 
(3.975s) 2022-11-23T02:48:45.7474038Z 2022-11-23T02:48:45.7474315Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7474434Z Ran 1 test in 3.975s 2022-11-23T02:48:45.7474454Z 2022-11-23T02:48:45.7474546Z OK 2022-11-23T02:48:45.7474565Z 2022-11-23T02:48:45.7474691Z Generating XML reports... 2022-11-23T02:48:45.7475286Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023913.xml 2022-11-23T02:48:45.7475685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7475845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7476235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7476429Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7476689Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx0mnawfo 2022-11-23T02:48:45.7476964Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx0mnawfo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7477053Z 2022-11-23T02:48:45.7477169Z Running tests... 2022-11-23T02:48:45.7477443Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7477762Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7477999Z test_ddp_comm_hook_future_passing_gpu_gloo (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7478276Z This unit test verifies whether the Future object is passed properly using gloo backend. ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7478498Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72377 2022-11-23T02:48:45.7478718Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72378 2022-11-23T02:48:45.7479094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7479274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7479659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7479854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7480229Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7480386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7480767Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7480960Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7481223Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpptzi5540 2022-11-23T02:48:45.7481497Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpptzi5540/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7481759Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp95np1m5k 2022-11-23T02:48:45.7482030Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp95np1m5k/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7482266Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7482496Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7482579Z ok 
(5.431s) 2022-11-23T02:48:45.7482599Z 2022-11-23T02:48:45.7482870Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7482981Z Ran 1 test in 5.431s 2022-11-23T02:48:45.7483001Z 2022-11-23T02:48:45.7483092Z OK 2022-11-23T02:48:45.7483110Z 2022-11-23T02:48:45.7483235Z Generating XML reports... 2022-11-23T02:48:45.7483700Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023919.xml 2022-11-23T02:48:45.7484080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7484258Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7484676Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7484878Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7485139Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiugwo2sa 2022-11-23T02:48:45.7485413Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiugwo2sa/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7485433Z 2022-11-23T02:48:45.7485541Z Running tests... 2022-11-23T02:48:45.7485812Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7486129Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7486402Z test_ddp_comm_hook_register_just_once (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7486685Z DDP communication hook can only be registered once. This test validates whether ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7486891Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72492 2022-11-23T02:48:45.7487111Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72493 2022-11-23T02:48:45.7487489Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7487668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7488053Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7488248Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7488621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7488794Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7489162Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7489350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7489609Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa5nfhu_n 2022-11-23T02:48:45.7489886Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa5nfhu_n/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7490118Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7490378Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpycb4ujgk 2022-11-23T02:48:45.7490650Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpycb4ujgk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7490885Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7490987Z ok 
(3.958s) 2022-11-23T02:48:45.7491007Z 2022-11-23T02:48:45.7491260Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7491375Z Ran 1 test in 3.958s 2022-11-23T02:48:45.7491394Z 2022-11-23T02:48:45.7491484Z OK 2022-11-23T02:48:45.7491503Z 2022-11-23T02:48:45.7491627Z Generating XML reports... 2022-11-23T02:48:45.7492095Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023927.xml 2022-11-23T02:48:45.7492470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7492651Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7493035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7493213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7493472Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps_lqwcu7 2022-11-23T02:48:45.7493791Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps_lqwcu7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7493812Z 2022-11-23T02:48:45.7493925Z Running tests... 2022-11-23T02:48:45.7494193Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7494506Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7494729Z test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7495011Z Runs "test_sparse_gradients" unit test with DDP communication hook. We define a ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7495233Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72601 2022-11-23T02:48:45.7495506Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72602 2022-11-23T02:48:45.7495889Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7496070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7496458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7496651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7497027Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7497203Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7497585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7497762Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7498024Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8w35r834 2022-11-23T02:48:45.7498297Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8w35r834/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7498529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7498787Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3gkg9stk 2022-11-23T02:48:45.7499058Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3gkg9stk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7499291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7499393Z ok 
(4.026s) 2022-11-23T02:48:45.7499412Z 2022-11-23T02:48:45.7499683Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7499781Z Ran 1 test in 4.027s 2022-11-23T02:48:45.7499801Z 2022-11-23T02:48:45.7499893Z OK 2022-11-23T02:48:45.7499913Z 2022-11-23T02:48:45.7500036Z Generating XML reports... 2022-11-23T02:48:45.7500508Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023933.xml 2022-11-23T02:48:45.7500885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7501061Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7501446Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7501639Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7501877Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx3tn0uji 2022-11-23T02:48:45.7502149Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx3tn0uji/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7502173Z 2022-11-23T02:48:45.7502282Z Running tests... 2022-11-23T02:48:45.7502548Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7502912Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7503135Z test_ddp_invalid_comm_hook_init (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7503412Z This unit test makes sure that register_comm_hook properly checks the format ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7503633Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72744 2022-11-23T02:48:45.7503854Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72745 2022-11-23T02:48:45.7504215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7504394Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7504833Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7505027Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7505401Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7505577Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7505954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7506144Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7506384Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3itdceh6 2022-11-23T02:48:45.7506657Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3itdceh6/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7506918Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplx5peig9 2022-11-23T02:48:45.7507191Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplx5peig9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7507428Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7507661Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7507764Z ok 
(3.972s) 2022-11-23T02:48:45.7507784Z 2022-11-23T02:48:45.7508053Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7508162Z Ran 1 test in 3.972s 2022-11-23T02:48:45.7508182Z 2022-11-23T02:48:45.7508255Z OK 2022-11-23T02:48:45.7508274Z 2022-11-23T02:48:45.7508397Z Generating XML reports... 2022-11-23T02:48:45.7508864Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023939.xml 2022-11-23T02:48:45.7509499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7509680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7510071Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7510266Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7510525Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdvxqqw1x 2022-11-23T02:48:45.7510778Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdvxqqw1x/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7510817Z 2022-11-23T02:48:45.7510906Z Running tests... 2022-11-23T02:48:45.7511176Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7511491Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7511721Z test_ddp_invalid_comm_hook_return_type (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7512004Z This test checks whether return annotation checked properly if defined. It also ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7512301Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72853 2022-11-23T02:48:45.7512535Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72854 2022-11-23T02:48:45.7512915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7513076Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7513464Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7513657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7514025Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7514266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7514654Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7514846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7515108Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd3rcp545 2022-11-23T02:48:45.7515362Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd3rcp545/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7515594Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7515854Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpft5yp9cu 2022-11-23T02:48:45.7516125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpft5yp9cu/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7516361Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7516465Z ok 
(4.080s) 2022-11-23T02:48:45.7516485Z 2022-11-23T02:48:45.7516761Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7516876Z Ran 1 test in 4.080s 2022-11-23T02:48:45.7516895Z 2022-11-23T02:48:45.7516989Z OK 2022-11-23T02:48:45.7517008Z 2022-11-23T02:48:45.7517113Z Generating XML reports... 2022-11-23T02:48:45.7517579Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023946.xml 2022-11-23T02:48:45.7517957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7518135Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7518519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7518716Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7518971Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvbu73c8s 2022-11-23T02:48:45.7519246Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvbu73c8s/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7519267Z 2022-11-23T02:48:45.7519378Z Running tests... 2022-11-23T02:48:45.7519626Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7519943Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7520202Z test_find_unused_parameters_when_unused_parameters_empty (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7520476Z An empty unused_parameters array does not imply find_unused_parameters = ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7520701Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 72966 2022-11-23T02:48:45.7520925Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 72967 2022-11-23T02:48:45.7521304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7521533Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7521910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7522108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7522481Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7522655Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7523034Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7523282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7523542Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1j5prcyo 2022-11-23T02:48:45.7523823Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1j5prcyo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7524063Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe5b0epzu 2022-11-23T02:48:45.7524338Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe5b0epzu/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7524573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7524799Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7525597Z [W 
reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator()) 2022-11-23T02:48:45.7525705Z ok (5.535s) 2022-11-23T02:48:45.7525724Z 2022-11-23T02:48:45.7525995Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7526122Z Ran 1 test in 5.535s 2022-11-23T02:48:45.7526141Z 2022-11-23T02:48:45.7526234Z OK 2022-11-23T02:48:45.7526253Z 2022-11-23T02:48:45.7526379Z Generating XML reports... 2022-11-23T02:48:45.7526843Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023952.xml 2022-11-23T02:48:45.7527200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7527381Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7527770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7527966Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7528230Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph82zb_qp 2022-11-23T02:48:45.7528505Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph82zb_qp/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7528525Z 2022-11-23T02:48:45.7528633Z Running tests... 
2022-11-23T02:48:45.7528904Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7529199Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7529491Z test_global_local_unused_params_grad (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7529717Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73081 2022-11-23T02:48:45.7529939Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73082 2022-11-23T02:48:45.7530363Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7530550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7530941Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7531135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7531504Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7531662Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7532094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7532284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7532549Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsrhxtjnf 2022-11-23T02:48:45.7532824Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsrhxtjnf/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7533059Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7533315Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprg8x_pod 2022-11-23T02:48:45.7533585Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprg8x_pod/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7533799Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7533900Z ok (5.575s) 2022-11-23T02:48:45.7533923Z 2022-11-23T02:48:45.7534194Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7534306Z Ran 1 test in 5.575s 2022-11-23T02:48:45.7534325Z 2022-11-23T02:48:45.7534416Z OK 2022-11-23T02:48:45.7534435Z 2022-11-23T02:48:45.7534559Z Generating XML reports... 2022-11-23T02:48:45.7535031Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024000.xml 2022-11-23T02:48:45.7535410Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7535586Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7535953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7536146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7536402Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7g769miy 2022-11-23T02:48:45.7536675Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7g769miy/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7536695Z 2022-11-23T02:48:45.7536802Z Running tests... 
2022-11-23T02:48:45.7537076Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7537393Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7537705Z test_global_local_unused_params_grad_with_grad_is_view (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7537908Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73196 2022-11-23T02:48:45.7538129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73197 2022-11-23T02:48:45.7538507Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7538687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7539074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7539316Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7539698Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7539873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7540254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7540428Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7540687Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf6z_nv0o 2022-11-23T02:48:45.7540959Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf6z_nv0o/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7541258Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7541518Z 
INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpiq924rqr 2022-11-23T02:48:45.7541791Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpiq924rqr/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7542022Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7542124Z ok (5.678s) 2022-11-23T02:48:45.7542143Z 2022-11-23T02:48:45.7542399Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7542513Z Ran 1 test in 5.679s 2022-11-23T02:48:45.7542533Z 2022-11-23T02:48:45.7542626Z OK 2022-11-23T02:48:45.7542645Z 2022-11-23T02:48:45.7542767Z Generating XML reports... 2022-11-23T02:48:45.7543232Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024008.xml 2022-11-23T02:48:45.7543613Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7543791Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7544179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7544374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7544614Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprb0mcsgt 2022-11-23T02:48:45.7544887Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprb0mcsgt/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7544907Z 2022-11-23T02:48:45.7545014Z Running tests... 
2022-11-23T02:48:45.7545286Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7545600Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7545921Z test_global_local_unused_params_grad_with_static_graph (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7546147Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73311 2022-11-23T02:48:45.7546369Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73312 2022-11-23T02:48:45.7546723Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7546900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7547283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7547474Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7547849Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7548028Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7548405Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7548646Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7548916Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpss8ln0jp 2022-11-23T02:48:45.7549406Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpss8ln0jp/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7549668Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4umnr3lc 2022-11-23T02:48:45.7549942Z 
INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4umnr3lc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7550176Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7550507Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7551451Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7551568Z warnings.warn( 2022-11-23T02:48:45.7552487Z /opt/conda/lib/python3.10/site-packages/torch/nn/parallel/distributed.py:1862: UserWarning: You passed find_unused_parameters=true to DistributedDataParallel, `_set_static_graph` will detect unused parameters automatically, so you do not need to set find_unused_parameters=true, just be sure these unused parameters will not change during training loop while calling `_set_static_graph`. 2022-11-23T02:48:45.7552601Z warnings.warn( 2022-11-23T02:48:45.7552703Z ok (5.566s) 2022-11-23T02:48:45.7552723Z 2022-11-23T02:48:45.7552973Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7553086Z Ran 1 test in 5.566s 2022-11-23T02:48:45.7553106Z 2022-11-23T02:48:45.7553197Z OK 2022-11-23T02:48:45.7553219Z 2022-11-23T02:48:45.7553345Z Generating XML reports... 
2022-11-23T02:48:45.7553816Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024016.xml 2022-11-23T02:48:45.7554192Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7554371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7554758Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7554951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7555193Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjq96krxc 2022-11-23T02:48:45.7555466Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjq96krxc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7555485Z 2022-11-23T02:48:45.7555596Z Running tests... 2022-11-23T02:48:45.7555867Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7556187Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7556499Z test_gloo_backend_1gpu_module_device_ids_integer_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7556722Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73426 2022-11-23T02:48:45.7556944Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73427 2022-11-23T02:48:45.7557302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7557485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7557871Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7558174Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7558570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7558748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7559136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7559331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7559594Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeb13aqek 2022-11-23T02:48:45.7559903Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeb13aqek/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7560140Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7560399Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp732azk9x 2022-11-23T02:48:45.7560670Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp732azk9x/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7560902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7561143Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7561383Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7561488Z ok (6.061s) 2022-11-23T02:48:45.7561508Z 2022-11-23T02:48:45.7561762Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7561877Z Ran 1 test in 6.061s 2022-11-23T02:48:45.7561897Z 2022-11-23T02:48:45.7561989Z OK 2022-11-23T02:48:45.7562009Z 2022-11-23T02:48:45.7562133Z Generating XML reports... 2022-11-23T02:48:45.7562605Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024024.xml 2022-11-23T02:48:45.7562984Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7563165Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7563551Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7563746Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7563985Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4mmhy85t 2022-11-23T02:48:45.7564258Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4mmhy85t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7564281Z 2022-11-23T02:48:45.7564386Z Running tests... 2022-11-23T02:48:45.7564652Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7564966Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7565282Z test_gloo_backend_1gpu_module_device_ids_torch_device_list (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7565502Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73541 2022-11-23T02:48:45.7565716Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73542 2022-11-23T02:48:45.7566074Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7566250Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7566630Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7566820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7567240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7567422Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7567807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7567994Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7568249Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp29srm6jb 2022-11-23T02:48:45.7568502Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp29srm6jb/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7568754Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp52p5k8nn 2022-11-23T02:48:45.7569073Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp52p5k8nn/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7569300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7569530Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7569767Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7570007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7570108Z ok (5.956s) 2022-11-23T02:48:45.7570127Z 2022-11-23T02:48:45.7570382Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7570491Z Ran 1 test in 5.957s 2022-11-23T02:48:45.7570511Z 2022-11-23T02:48:45.7570599Z OK 2022-11-23T02:48:45.7570619Z 2022-11-23T02:48:45.7570738Z Generating XML reports... 2022-11-23T02:48:45.7571208Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024032.xml 2022-11-23T02:48:45.7571580Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7571756Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7572136Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7572326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7572567Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwb7wnl33 2022-11-23T02:48:45.7572836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwb7wnl33/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7572856Z 2022-11-23T02:48:45.7572961Z Running tests... 2022-11-23T02:48:45.7573219Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7573532Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7573809Z test_gloo_backend_2gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7574033Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73656 2022-11-23T02:48:45.7574251Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73657 2022-11-23T02:48:45.7574608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7574785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7575166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7575354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7575714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7575885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7576317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7576510Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7576767Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpemi7qkcg 2022-11-23T02:48:45.7577021Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpemi7qkcg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7577269Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0n0xferx 2022-11-23T02:48:45.7577539Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0n0xferx/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7577766Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7578045Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7578195Z 
skip: Need at least 4 CUDA devices (3.966s) 2022-11-23T02:48:45.7578215Z 2022-11-23T02:48:45.7578487Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7578599Z Ran 1 test in 3.967s 2022-11-23T02:48:45.7578619Z 2022-11-23T02:48:45.7578707Z OK (skipped=1) 2022-11-23T02:48:45.7578743Z 2022-11-23T02:48:45.7578847Z Generating XML reports... 2022-11-23T02:48:45.7579311Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024040.xml 2022-11-23T02:48:45.7579684Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7579856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7580239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7580431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7580691Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprn4t4whq 2022-11-23T02:48:45.7580959Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprn4t4whq/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7580978Z 2022-11-23T02:48:45.7581067Z Running tests... 2022-11-23T02:48:45.7581328Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7581641Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7581919Z test_gloo_backend_4gpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7582135Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73759 2022-11-23T02:48:45.7582356Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73760 2022-11-23T02:48:45.7582727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7582904Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7583287Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7583461Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7583831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7583999Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7584374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7584564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7584815Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp5q5tp7t 2022-11-23T02:48:45.7585086Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp5q5tp7t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7585387Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj7ibibgl 2022-11-23T02:48:45.7585651Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj7ibibgl/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7585877Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7586102Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7586247Z 
skip: Need at least 8 CUDA devices (3.916s) 2022-11-23T02:48:45.7586267Z 2022-11-23T02:48:45.7586536Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7586693Z Ran 1 test in 3.916s 2022-11-23T02:48:45.7586713Z 2022-11-23T02:48:45.7586809Z OK (skipped=1) 2022-11-23T02:48:45.7586829Z 2022-11-23T02:48:45.7586948Z Generating XML reports... 2022-11-23T02:48:45.7587417Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024047.xml 2022-11-23T02:48:45.7587775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7587943Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7588317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7588501Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7588753Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8_g59x1j 2022-11-23T02:48:45.7589259Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8_g59x1j/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7589285Z 2022-11-23T02:48:45.7589398Z Running tests... 2022-11-23T02:48:45.7589658Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7589962Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7590239Z test_gloo_backend_cpu_module (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7590454Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73862 2022-11-23T02:48:45.7590670Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73863 2022-11-23T02:48:45.7591045Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7591221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7591600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7591785Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7592161Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7592319Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7592691Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7592875Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7593132Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppfit124y 2022-11-23T02:48:45.7593399Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppfit124y/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7593650Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxto3r2xs 2022-11-23T02:48:45.7593880Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7594150Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxto3r2xs/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7594438Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7594689Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7594922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7595022Z ok (3.922s) 2022-11-23T02:48:45.7595041Z 2022-11-23T02:48:45.7595307Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7595413Z Ran 1 test in 3.923s 2022-11-23T02:48:45.7595433Z 2022-11-23T02:48:45.7595522Z OK 2022-11-23T02:48:45.7595541Z 2022-11-23T02:48:45.7595659Z Generating XML reports... 2022-11-23T02:48:45.7596190Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024053.xml 2022-11-23T02:48:45.7596549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7596728Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7597101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7597296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7597545Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1xp5vnns 2022-11-23T02:48:45.7597817Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1xp5vnns/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7597836Z 2022-11-23T02:48:45.7597937Z Running tests... 2022-11-23T02:48:45.7598201Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7598503Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7598798Z test_gloo_backend_cpu_module_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7599016Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 73975 2022-11-23T02:48:45.7599229Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 73976 2022-11-23T02:48:45.7599598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7599765Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7600144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7600331Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7600704Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7600863Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7601246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7601431Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7601680Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2a5j_s3_ 2022-11-23T02:48:45.7601944Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2a5j_s3_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7602174Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7602427Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpny05crn0 2022-11-23T02:48:45.7602690Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpny05crn0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7602908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7603142Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7603425Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7603530Z ok (4.097s) 2022-11-23T02:48:45.7603550Z 2022-11-23T02:48:45.7603819Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7603926Z Ran 1 test in 4.097s 2022-11-23T02:48:45.7603945Z 2022-11-23T02:48:45.7604029Z OK 2022-11-23T02:48:45.7604048Z 2022-11-23T02:48:45.7604172Z Generating XML reports... 2022-11-23T02:48:45.7604631Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024059.xml 2022-11-23T02:48:45.7604990Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7605229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7605610Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7605801Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7606052Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxxcnz29q 2022-11-23T02:48:45.7606323Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxxcnz29q/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7606342Z 2022-11-23T02:48:45.7606449Z Running tests... 2022-11-23T02:48:45.7606710Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7607009Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7607204Z test_ignored_output (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7607461Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7607677Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74088 2022-11-23T02:48:45.7607891Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74089 2022-11-23T02:48:45.7608265Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7608437Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7608811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7609004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7609357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7609531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7609903Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7610093Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7610349Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5ruijdtv 2022-11-23T02:48:45.7610612Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5ruijdtv/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7610839Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7611092Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcqr5_s4j 2022-11-23T02:48:45.7611343Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcqr5_s4j/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7611568Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7611801Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7612035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7612134Z ok (4.070s) 2022-11-23T02:48:45.7612220Z 2022-11-23T02:48:45.7612497Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7612605Z Ran 1 test in 4.071s 2022-11-23T02:48:45.7612625Z 2022-11-23T02:48:45.7612715Z OK 2022-11-23T02:48:45.7612734Z 2022-11-23T02:48:45.7612839Z Generating XML reports... 2022-11-23T02:48:45.7613297Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024106.xml 2022-11-23T02:48:45.7613666Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7613840Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7614315Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7614499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7614758Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpt5sr_5i7 2022-11-23T02:48:45.7615024Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpt5sr_5i7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7615044Z 2022-11-23T02:48:45.7615141Z Running tests... 2022-11-23T02:48:45.7615393Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7615702Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7615935Z test_ignored_output_with_unused_parameters (__main__.DistributedDataParallelTest) 2022-11-23T02:48:45.7616190Z Test that the output of a model can be ignored and that there is no ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7616410Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74231 2022-11-23T02:48:45.7616625Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74232 2022-11-23T02:48:45.7616999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7617171Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7617539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7617725Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7618094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7618263Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7618643Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7618825Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7619079Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpowjsryk8 2022-11-23T02:48:45.7619349Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpowjsryk8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7619602Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpavyajl2p 2022-11-23T02:48:45.7619858Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpavyajl2p/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7620089Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7620317Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7620408Z ok 
(3.920s) 2022-11-23T02:48:45.7620431Z 2022-11-23T02:48:45.7620694Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7620801Z Ran 1 test in 3.920s 2022-11-23T02:48:45.7620820Z 2022-11-23T02:48:45.7620906Z OK 2022-11-23T02:48:45.7620926Z 2022-11-23T02:48:45.7621045Z Generating XML reports... 2022-11-23T02:48:45.7621549Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024112.xml 2022-11-23T02:48:45.7621932Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7622104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7622483Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7622669Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7622921Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdn5cqdo0 2022-11-23T02:48:45.7623240Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdn5cqdo0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7623261Z 2022-11-23T02:48:45.7623361Z Running tests... 2022-11-23T02:48:45.7623624Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7623923Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7624200Z test_ignored_sharded_tensor (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7624410Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74374 2022-11-23T02:48:45.7624621Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74375 2022-11-23T02:48:45.7624985Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7625150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7625532Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7625721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7626087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7626246Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7626620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7626800Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7627052Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq7ynwz8y 2022-11-23T02:48:45.7627321Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq7ynwz8y/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7627580Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjt9sfyp6 2022-11-23T02:48:45.7627844Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjt9sfyp6/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7628072Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7628286Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7628527Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7628767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7629501Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7629910Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7630011Z ok (5.558s) 2022-11-23T02:48:45.7630030Z 2022-11-23T02:48:45.7630292Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7630395Z Ran 1 test in 5.558s 2022-11-23T02:48:45.7630415Z 2022-11-23T02:48:45.7630500Z OK 2022-11-23T02:48:45.7630598Z 2022-11-23T02:48:45.7630713Z Generating XML reports... 2022-11-23T02:48:45.7631174Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024118.xml 2022-11-23T02:48:45.7631545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7631721Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7632099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7632290Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7632605Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppl4zomud 2022-11-23T02:48:45.7632873Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppl4zomud/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7632893Z 2022-11-23T02:48:45.7632999Z Running tests... 
2022-11-23T02:48:45.7633249Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7633557Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7633827Z test_invalid_powerSGD_state (__main__.DistributedDataParallelTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7634042Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74485 2022-11-23T02:48:45.7634256Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74486 2022-11-23T02:48:45.7634628Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7634806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7635184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7635362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7635731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7635900Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7636273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7636457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7636713Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2tyzjyum 2022-11-23T02:48:45.7636986Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2tyzjyum/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7637212Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7637766Z 
INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7638316Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7638861Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7639451Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7640007Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7640593Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate 
= 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7640855Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4x3terpo 2022-11-23T02:48:45.7641110Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4x3terpo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7641333Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7641876Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7642428Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7642966Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 0; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7643513Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; 
batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7644064Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = True; warm_start = False; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7644608Z INFO:torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook:PowerSGD config: matrix_approximation_rank = 1; start_powerSGD_iter = 1; min_compression_rate = 2; orthogonalization_epsilon = 0; use_error_feedback = False; warm_start = True; random_seed = 0; compression_stats_logging_frequency = 10000; batch_tensors_with_same_shape = False 2022-11-23T02:48:45.7644706Z ok (3.980s) 2022-11-23T02:48:45.7644726Z 2022-11-23T02:48:45.7645003Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7645114Z Ran 1 test in 3.980s 2022-11-23T02:48:45.7645134Z 2022-11-23T02:48:45.7645220Z OK 2022-11-23T02:48:45.7645239Z 2022-11-23T02:48:45.7645356Z Generating XML reports... 
2022-11-23T02:48:45.7645917Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024126.xml 2022-11-23T02:48:45.7646294Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7646466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7646847Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7647040Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7647300Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpklkbyz2y 2022-11-23T02:48:45.7647619Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpklkbyz2y/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7647639Z 2022-11-23T02:48:45.7647742Z Running tests... 2022-11-23T02:48:45.7648012Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7648326Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7648588Z test_save_load_checkpoint (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7648806Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74588 2022-11-23T02:48:45.7649024Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74589 2022-11-23T02:48:45.7649398Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7649570Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7649953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7650143Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7650519Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7650680Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7651060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7651246Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7651500Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbhs0o2ey 2022-11-23T02:48:45.7651764Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbhs0o2ey/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7651994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7652248Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpow_xwdbs 2022-11-23T02:48:45.7652516Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpow_xwdbs/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7652739Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7652970Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7653211Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7653615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7654010Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:48:45.7654247Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7654479Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7654757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7654995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7655079Z ok (6.085s) 2022-11-23T02:48:45.7655112Z 2022-11-23T02:48:45.7655365Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7655472Z Ran 1 test in 6.085s 2022-11-23T02:48:45.7655491Z 2022-11-23T02:48:45.7655576Z OK 2022-11-23T02:48:45.7655595Z 2022-11-23T02:48:45.7655717Z Generating XML reports... 
2022-11-23T02:48:45.7656181Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024133.xml 2022-11-23T02:48:45.7656609Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7656784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7657170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7657346Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7657600Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8s4rblxc 2022-11-23T02:48:45.7657868Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8s4rblxc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7657888Z 2022-11-23T02:48:45.7657989Z Running tests... 2022-11-23T02:48:45.7658296Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7658608Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7658875Z test_sparse_gradients (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7659095Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74703 2022-11-23T02:48:45.7659299Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74704 2022-11-23T02:48:45.7659667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7659836Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7660215Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7660399Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7660763Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7660935Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7661318Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7661504Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7661748Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgz46vq79 2022-11-23T02:48:45.7662016Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgz46vq79/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7662243Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7662490Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp45l0d2uj 2022-11-23T02:48:45.7662752Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp45l0d2uj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7662979Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7663080Z ok 
(3.916s) 2022-11-23T02:48:45.7663100Z 2022-11-23T02:48:45.7663368Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7663462Z Ran 1 test in 3.916s 2022-11-23T02:48:45.7663493Z 2022-11-23T02:48:45.7663567Z OK 2022-11-23T02:48:45.7663638Z 2022-11-23T02:48:45.7663763Z Generating XML reports... 2022-11-23T02:48:45.7664223Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024141.xml 2022-11-23T02:48:45.7664596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7664774Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7665149Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7665340Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7665664Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbiehtpyt 2022-11-23T02:48:45.7665920Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbiehtpyt/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7665953Z 2022-11-23T02:48:45.7666047Z Running tests... 2022-11-23T02:48:45.7666311Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7666620Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7666906Z test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7667121Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74846 2022-11-23T02:48:45.7667339Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74847 2022-11-23T02:48:45.7667713Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7667876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7668262Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7668457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7668817Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7669223Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7669620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7669806Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7670060Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc4m9ua7y 2022-11-23T02:48:45.7670332Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc4m9ua7y/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7670547Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7670804Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzzih4a_t 2022-11-23T02:48:45.7671075Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzzih4a_t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7671302Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7671397Z ok 
(3.955s) 2022-11-23T02:48:45.7671418Z 2022-11-23T02:48:45.7671683Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7671791Z Ran 1 test in 3.956s 2022-11-23T02:48:45.7671811Z 2022-11-23T02:48:45.7671894Z OK 2022-11-23T02:48:45.7671913Z 2022-11-23T02:48:45.7672018Z Generating XML reports... 2022-11-23T02:48:45.7672478Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024147.xml 2022-11-23T02:48:45.7672857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7673119Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7673514Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7673706Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7673955Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_j9u7m9i 2022-11-23T02:48:45.7674222Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_j9u7m9i/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7674241Z 2022-11-23T02:48:45.7674346Z Running tests... 2022-11-23T02:48:45.7674593Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7674983Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7675258Z test_sync_batch_norm_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7675476Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 74989 2022-11-23T02:48:45.7675693Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 74990 2022-11-23T02:48:45.7676063Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7676235Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7676616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7676803Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7677155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7677328Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7677707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7677893Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7678144Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp90z2ih79 2022-11-23T02:48:45.7678409Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp90z2ih79/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7678660Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgj86w_8g 2022-11-23T02:48:45.7678923Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgj86w_8g/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7679136Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7679360Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7679599Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7679838Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7679932Z ok (6.871s) 2022-11-23T02:48:45.7679951Z 2022-11-23T02:48:45.7680218Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7680328Z Ran 1 test in 6.871s 2022-11-23T02:48:45.7680348Z 2022-11-23T02:48:45.7680431Z OK 2022-11-23T02:48:45.7680450Z 2022-11-23T02:48:45.7680555Z Generating XML reports... 2022-11-23T02:48:45.7681014Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024154.xml 2022-11-23T02:48:45.7681382Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7681554Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7681927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7682162Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7682425Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7kfznhnc 2022-11-23T02:48:45.7682695Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7kfznhnc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7682714Z 2022-11-23T02:48:45.7682819Z Running tests... 2022-11-23T02:48:45.7683073Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7683384Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7683673Z test_sync_batch_norm_only_empty_input (__main__.DistributedDataParallelTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7683946Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75104 2022-11-23T02:48:45.7684161Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75105 2022-11-23T02:48:45.7684541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7684711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7685090Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7685279Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7685631Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7685806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7686174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7686361Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7686616Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfr9ixakz 2022-11-23T02:48:45.7686887Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfr9ixakz/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7687115Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7687362Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoig3oy0k 2022-11-23T02:48:45.7687614Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoig3oy0k/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7687837Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7688074Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7688318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:48:45.7688420Z ok (6.258s) 2022-11-23T02:48:45.7688439Z 2022-11-23T02:48:45.7688713Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7688827Z Ran 1 test in 6.258s 2022-11-23T02:48:45.7688846Z 2022-11-23T02:48:45.7688939Z OK 2022-11-23T02:48:45.7688958Z 2022-11-23T02:48:45.7689063Z Generating XML reports... 2022-11-23T02:48:45.7689528Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024203.xml 2022-11-23T02:48:45.7689904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7690080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7690467Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7690665Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7690925Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzcdfbiid 2022-11-23T02:48:45.7691250Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzcdfbiid/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7691271Z 2022-11-23T02:48:45.7691387Z Running tests... 2022-11-23T02:48:45.7691638Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7691957Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7692300Z test_allgather_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7692524Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75219 2022-11-23T02:48:45.7692901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7693133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7693522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7693719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7693982Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps3wyzs8r 2022-11-23T02:48:45.7694238Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps3wyzs8r/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7694470Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7694719Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7695124Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:48:45.7695889Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7696005Z warnings.warn( 2022-11-23T02:48:45.7696106Z ok (3.872s) 2022-11-23T02:48:45.7696126Z 2022-11-23T02:48:45.7696397Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7696509Z Ran 1 test in 3.872s 2022-11-23T02:48:45.7696529Z 2022-11-23T02:48:45.7696603Z OK 2022-11-23T02:48:45.7696622Z 2022-11-23T02:48:45.7696749Z Generating XML reports... 2022-11-23T02:48:45.7697307Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024211.xml 2022-11-23T02:48:45.7697686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7697869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7698260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7698455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7698714Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp30e3z42l 2022-11-23T02:48:45.7698965Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp30e3z42l/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7699003Z 2022-11-23T02:48:45.7699093Z Running tests... 2022-11-23T02:48:45.7699360Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7699676Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7700017Z test_allreduce_coalesced (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7700243Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75291 2022-11-23T02:48:45.7700670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7700855Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7701245Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7701420Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7701681Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkpoauiv1 2022-11-23T02:48:45.7701957Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkpoauiv1/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7702191Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7702497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7702910Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:48:45.7703666Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7703779Z warnings.warn( 2022-11-23T02:48:45.7703879Z ok (3.851s) 2022-11-23T02:48:45.7703898Z 2022-11-23T02:48:45.7704147Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7704261Z Ran 1 test in 3.852s 2022-11-23T02:48:45.7704280Z 2022-11-23T02:48:45.7704373Z OK 2022-11-23T02:48:45.7704393Z 2022-11-23T02:48:45.7704521Z Generating XML reports... 2022-11-23T02:48:45.7705077Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024218.xml 2022-11-23T02:48:45.7705458Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7705638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7706024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7706219Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7706463Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppkr1j18_ 2022-11-23T02:48:45.7706738Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppkr1j18_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7706758Z 2022-11-23T02:48:45.7706864Z Running tests... 2022-11-23T02:48:45.7707137Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7707455Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7707792Z test_collectives (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7708017Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75363 2022-11-23T02:48:45.7708396Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7708556Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7709165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7709374Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7709636Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7uf6h2r 2022-11-23T02:48:45.7709916Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7uf6h2r/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7710150Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7710475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7710910Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:48:45.7711014Z ok (3.863s) 2022-11-23T02:48:45.7711034Z 2022-11-23T02:48:45.7711282Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7711397Z Ran 1 test in 3.863s 2022-11-23T02:48:45.7711416Z 2022-11-23T02:48:45.7711507Z OK 2022-11-23T02:48:45.7711526Z 2022-11-23T02:48:45.7711651Z Generating XML reports... 
2022-11-23T02:48:45.7712207Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024224.xml 2022-11-23T02:48:45.7712659Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7712842Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7713285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7713466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7713769Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjb8ly6td 2022-11-23T02:48:45.7714093Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjb8ly6td/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7714115Z 2022-11-23T02:48:45.7714239Z Running tests... 2022-11-23T02:48:45.7714675Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7715090Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7715466Z test_monitored_barrier (__main__.GlooProcessGroupWithDispatchedCollectivesTests) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7715728Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75435 2022-11-23T02:48:45.7716146Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7716308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7716747Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7716977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7717273Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1dj_u04n 2022-11-23T02:48:45.7717635Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1dj_u04n/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7717912Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7718201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7718654Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:48:45.7718806Z ok (3.892s) 2022-11-23T02:48:45.7718827Z 2022-11-23T02:48:45.7719079Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7719229Z Ran 1 test in 3.892s 2022-11-23T02:48:45.7719249Z 2022-11-23T02:48:45.7719376Z OK 2022-11-23T02:48:45.7719395Z 2022-11-23T02:48:45.7719556Z Generating XML reports... 
2022-11-23T02:48:45.7720231Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024230.xml 2022-11-23T02:48:45.7720658Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7720874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7721361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7721603Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7737114Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp57xifzs7 2022-11-23T02:48:45.7737456Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp57xifzs7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7737492Z 2022-11-23T02:48:45.7737589Z Running tests... 2022-11-23T02:48:45.7737912Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7738259Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7738642Z test_allgather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7738872Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75507 2022-11-23T02:48:45.7739108Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75508 2022-11-23T02:48:45.7739331Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75509 2022-11-23T02:48:45.7739564Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75510 2022-11-23T02:48:45.7739954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7740143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7740541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7740744Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7741130Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7741314Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7741721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7741915Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7742292Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7742474Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7742875Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7743072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7743463Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7743641Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7744052Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7744251Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7744513Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo84tsucd 2022-11-23T02:48:45.7744765Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpz0z14trb 2022-11-23T02:48:45.7745049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo84tsucd/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7745323Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpz0z14trb/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7745564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7745807Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7746133Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpugx4c10f 2022-11-23T02:48:45.7746417Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpugx4c10f/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7746681Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2dpbv74w 2022-11-23T02:48:45.7746945Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2dpbv74w/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7747184Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7747420Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7747518Z ok (4.058s) 
2022-11-23T02:48:45.7747589Z 2022-11-23T02:48:45.7747875Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7747989Z Ran 1 test in 4.058s 2022-11-23T02:48:45.7748009Z 2022-11-23T02:48:45.7748096Z OK 2022-11-23T02:48:45.7748115Z 2022-11-23T02:48:45.7748235Z Generating XML reports... 2022-11-23T02:48:45.7748701Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024236.xml 2022-11-23T02:48:45.7749341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7749528Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7749927Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7750119Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7750373Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7oagpz52 2022-11-23T02:48:45.7750649Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7oagpz52/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7750670Z 2022-11-23T02:48:45.7750767Z Running tests... 2022-11-23T02:48:45.7751035Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7751339Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7751598Z test_allgather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7751813Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75690 2022-11-23T02:48:45.7752023Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75691 2022-11-23T02:48:45.7752236Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75692 2022-11-23T02:48:45.7752441Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75693 2022-11-23T02:48:45.7752823Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7752994Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7753368Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7753554Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7753925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7754090Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7754469Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7754654Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7755024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7755195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7755664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7755851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7756218Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7756396Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7756764Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7756941Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7757200Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcxd7ka_3 2022-11-23T02:48:45.7757539Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcxd7ka_3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7757770Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7758025Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprt2gm9pq 2022-11-23T02:48:45.7758323Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprt2gm9pq/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7758573Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4irsdscj 2022-11-23T02:48:45.7758818Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe8dc8zu3 2022-11-23T02:48:45.7759075Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4irsdscj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7759342Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe8dc8zu3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7759571Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7759790Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7760013Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7760097Z ok (5.983s) 
2022-11-23T02:48:45.7760117Z 2022-11-23T02:48:45.7760383Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7760492Z Ran 1 test in 5.983s 2022-11-23T02:48:45.7760512Z 2022-11-23T02:48:45.7760600Z OK 2022-11-23T02:48:45.7760619Z 2022-11-23T02:48:45.7760734Z Generating XML reports... 2022-11-23T02:48:45.7761171Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024243.xml 2022-11-23T02:48:45.7761550Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7761727Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7762103Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7762284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7762538Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx3uev16g 2022-11-23T02:48:45.7762800Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx3uev16g/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7762820Z 2022-11-23T02:48:45.7762928Z Running tests... 2022-11-23T02:48:45.7763190Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7763503Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7763755Z test_allgather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7763975Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 75877 2022-11-23T02:48:45.7764178Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 75878 2022-11-23T02:48:45.7764452Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 75879 2022-11-23T02:48:45.7764678Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 75880 2022-11-23T02:48:45.7765058Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7765237Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7765625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7765822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7766201Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7766409Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7766794Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7766990Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7767361Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7767539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7767918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7768107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7768472Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7768648Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7769013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7769203Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7769460Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwdi93jw5 2022-11-23T02:48:45.7769730Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwdi93jw5/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7769984Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe7dpy2qw 2022-11-23T02:48:45.7770254Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe7dpy2qw/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7770483Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7770741Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphwhu28b4 2022-11-23T02:48:45.7771007Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphwhu28b4/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7771224Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7771479Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpilhhg3y8 2022-11-23T02:48:45.7771749Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpilhhg3y8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7771979Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7772206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7772304Z ok (4.165s) 
2022-11-23T02:48:45.7772324Z 2022-11-23T02:48:45.7772599Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7772715Z Ran 1 test in 4.165s 2022-11-23T02:48:45.7772734Z 2022-11-23T02:48:45.7772808Z OK 2022-11-23T02:48:45.7772827Z 2022-11-23T02:48:45.7772951Z Generating XML reports... 2022-11-23T02:48:45.7773392Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024251.xml 2022-11-23T02:48:45.7773816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7774003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7774389Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7774586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7774844Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb49x_bf_ 2022-11-23T02:48:45.7775094Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb49x_bf_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7775177Z 2022-11-23T02:48:45.7775273Z Running tests... 2022-11-23T02:48:45.7775537Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7775858Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7776128Z test_allgather_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7776348Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76060 2022-11-23T02:48:45.7776565Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76061 2022-11-23T02:48:45.7776782Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76062 2022-11-23T02:48:45.7776997Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76063 2022-11-23T02:48:45.7777357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7777537Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7777922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7778118Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7778490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7778664Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7779043Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7779231Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7779579Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7779755Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7780131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7780324Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7780697Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7780872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7781244Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7781434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7781691Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_jc1929x 2022-11-23T02:48:45.7781943Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_jc1929x/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7782201Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp32v_a68j 2022-11-23T02:48:45.7782468Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp32v_a68j/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7782750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7782982Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7783244Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoe4p6lcf 2022-11-23T02:48:45.7783520Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoe4p6lcf/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7783776Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpel14a4un 2022-11-23T02:48:45.7784026Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpel14a4un/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7784320Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7784551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7784802Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7785051Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7785295Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:48:45.7785534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:48:45.7785946Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7786347Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7786748Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7787134Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7787891Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7788004Z warnings.warn( 2022-11-23T02:48:45.7788748Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7788866Z warnings.warn( 2022-11-23T02:48:45.7789904Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7790015Z warnings.warn( 2022-11-23T02:48:45.7790753Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7790862Z warnings.warn( 2022-11-23T02:48:45.7790964Z ok (4.152s) 2022-11-23T02:48:45.7790984Z 2022-11-23T02:48:45.7791231Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7791344Z Ran 1 test in 4.152s 2022-11-23T02:48:45.7791367Z 2022-11-23T02:48:45.7791457Z OK 2022-11-23T02:48:45.7791476Z 2022-11-23T02:48:45.7791599Z Generating XML reports... 
2022-11-23T02:48:45.7792040Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024258.xml 2022-11-23T02:48:45.7792494Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7792684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7793075Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7793271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7793514Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9vcv0nop 2022-11-23T02:48:45.7793793Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9vcv0nop/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7793875Z 2022-11-23T02:48:45.7793988Z Running tests... 2022-11-23T02:48:45.7794257Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7794575Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7794853Z test_allgather_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7795075Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76243 2022-11-23T02:48:45.7795299Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76244 2022-11-23T02:48:45.7795500Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76245 2022-11-23T02:48:45.7795714Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76246 2022-11-23T02:48:45.7796089Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7796268Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7796656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7796851Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7797226Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7797399Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7797776Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7797949Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7798313Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7798485Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7798863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7799051Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7799420Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7799593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7799975Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7800146Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7800403Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoz04adr0 2022-11-23T02:48:45.7800675Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoz04adr0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7800935Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpussmk3xp 2022-11-23T02:48:45.7801205Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpussmk3xp/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7801510Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplkwdz1kl 2022-11-23T02:48:45.7801790Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplkwdz1kl/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7802044Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbhkggr4q 2022-11-23T02:48:45.7802313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbhkggr4q/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7802527Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7802750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7803028Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7803253Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7804015Z 
/opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7804130Z warnings.warn( 2022-11-23T02:48:45.7804876Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7804985Z warnings.warn( 2022-11-23T02:48:45.7805723Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7805835Z warnings.warn( 2022-11-23T02:48:45.7806553Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:2510: UserWarning: torch.distributed.all_gather_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7806661Z warnings.warn( 2022-11-23T02:48:45.7806760Z ok (4.020s) 2022-11-23T02:48:45.7806780Z 2022-11-23T02:48:45.7807045Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7807155Z Ran 1 test in 4.020s 2022-11-23T02:48:45.7807175Z 2022-11-23T02:48:45.7807262Z OK 2022-11-23T02:48:45.7807281Z 2022-11-23T02:48:45.7807402Z Generating XML reports... 
2022-11-23T02:48:45.7807847Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024304.xml 2022-11-23T02:48:45.7808223Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7808386Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7808773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7808965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7809222Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3l168gq2 2022-11-23T02:48:45.7809492Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3l168gq2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7809512Z 2022-11-23T02:48:45.7809617Z Running tests... 2022-11-23T02:48:45.7809882Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7810201Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7810459Z test_allgather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7810731Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76426 2022-11-23T02:48:45.7810962Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76427 2022-11-23T02:48:45.7811180Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76428 2022-11-23T02:48:45.7811400Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76429 2022-11-23T02:48:45.7811782Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7811959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7812344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7812587Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7812950Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7813130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7813510Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7813700Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7814069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7814243Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7814618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7814809Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7815160Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7815336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7815720Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7815907Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7816168Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr5lrlxxc 2022-11-23T02:48:45.7816439Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr5lrlxxc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7816694Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1v57pvv1 2022-11-23T02:48:45.7816967Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1v57pvv1/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7817197Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7817415Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7817673Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp99a8nip_ 2022-11-23T02:48:45.7817939Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp99a8nip_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7818166Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7818420Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp2b21qsl 2022-11-23T02:48:45.7818684Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp2b21qsl/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7818911Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7819016Z ok (4.155s) 
2022-11-23T02:48:45.7819036Z 2022-11-23T02:48:45.7819289Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7819400Z Ran 1 test in 4.155s 2022-11-23T02:48:45.7819419Z 2022-11-23T02:48:45.7819558Z OK 2022-11-23T02:48:45.7819580Z 2022-11-23T02:48:45.7819709Z Generating XML reports... 2022-11-23T02:48:45.7820152Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024310.xml 2022-11-23T02:48:45.7820529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7820707Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7821086Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7821276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7821569Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzjcn70ut 2022-11-23T02:48:45.7821838Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzjcn70ut/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7821860Z 2022-11-23T02:48:45.7821965Z Running tests... 2022-11-23T02:48:45.7822232Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7822549Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7822800Z test_allgather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7823022Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76609 2022-11-23T02:48:45.7823244Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76610 2022-11-23T02:48:45.7823445Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76611 2022-11-23T02:48:45.7823667Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76612 2022-11-23T02:48:45.7824042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7824221Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7824606Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7824797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7825163Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7825337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7825693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7825872Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7826258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7826449Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7826839Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7827030Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7827401Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7827576Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7827953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7828128Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7828392Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd1wdhl3i 2022-11-23T02:48:45.7828663Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd1wdhl3i/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7829131Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7829418Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqenivxps 2022-11-23T02:48:45.7829693Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqenivxps/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7829926Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7830185Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpafeg93uu 2022-11-23T02:48:45.7830458Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpafeg93uu/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7830773Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnihkwu9t 2022-11-23T02:48:45.7831048Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnihkwu9t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7831279Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7831509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7831609Z ok (4.638s) 
2022-11-23T02:48:45.7831629Z 2022-11-23T02:48:45.7831910Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7832023Z Ran 1 test in 4.638s 2022-11-23T02:48:45.7832042Z 2022-11-23T02:48:45.7832136Z OK 2022-11-23T02:48:45.7832155Z 2022-11-23T02:48:45.7832261Z Generating XML reports... 2022-11-23T02:48:45.7832700Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024317.xml 2022-11-23T02:48:45.7833082Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7833259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7833652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7833846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7834103Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp81s97ilk 2022-11-23T02:48:45.7834369Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp81s97ilk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7834389Z 2022-11-23T02:48:45.7834497Z Running tests... 2022-11-23T02:48:45.7834748Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7835063Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7835326Z test_allgather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7835548Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 76816 2022-11-23T02:48:45.7835771Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 76817 2022-11-23T02:48:45.7835992Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 76818 2022-11-23T02:48:45.7836207Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 76819 2022-11-23T02:48:45.7836586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7836746Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7837134Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7837326Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7837701Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7837873Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7838362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7838567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7838937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7839092Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7839470Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7839657Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7840090Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7840265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7840648Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7840836Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7841098Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxdamd752 2022-11-23T02:48:45.7841369Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxdamd752/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7841583Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7841837Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpziwfyfem 2022-11-23T02:48:45.7842113Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpziwfyfem/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7842371Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfl0vqms9 2022-11-23T02:48:45.7842623Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9x__gka2 2022-11-23T02:48:45.7842894Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfl0vqms9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7843156Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9x__gka2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7843388Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7843618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7843827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7843927Z ok (7.472s) 
2022-11-23T02:48:45.7843947Z 2022-11-23T02:48:45.7844224Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7844336Z Ran 1 test in 7.472s 2022-11-23T02:48:45.7844356Z 2022-11-23T02:48:45.7844447Z OK 2022-11-23T02:48:45.7844466Z 2022-11-23T02:48:45.7844592Z Generating XML reports... 2022-11-23T02:48:45.7845035Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024324.xml 2022-11-23T02:48:45.7845413Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7845574Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7845962Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7846152Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7846408Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpey04xje9 2022-11-23T02:48:45.7846680Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpey04xje9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7846700Z 2022-11-23T02:48:45.7846807Z Running tests... 2022-11-23T02:48:45.7847122Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7847449Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7847685Z test_allreduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7847909Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77027 2022-11-23T02:48:45.7848129Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77028 2022-11-23T02:48:45.7848344Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77029 2022-11-23T02:48:45.7848559Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77030 2022-11-23T02:48:45.7849012Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7849193Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7849583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7849775Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7850132Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7850308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7850686Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7850873Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7851241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7851416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7851796Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7851984Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7852335Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7852511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7852884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7853070Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7853325Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2bkf3hm7 2022-11-23T02:48:45.7853601Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2bkf3hm7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7853831Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7854089Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpj44xbr0j 2022-11-23T02:48:45.7854359Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpj44xbr0j/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7854594Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk_2843u2 2022-11-23T02:48:45.7854853Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk_2843u2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7855082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7855335Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa5k8nezn 2022-11-23T02:48:45.7855567Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7855836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa5k8nezn/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7856113Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7856224Z ok (4.155s) 
2022-11-23T02:48:45.7856245Z 2022-11-23T02:48:45.7856500Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7856615Z Ran 1 test in 4.155s 2022-11-23T02:48:45.7856634Z 2022-11-23T02:48:45.7856728Z OK 2022-11-23T02:48:45.7856747Z 2022-11-23T02:48:45.7856872Z Generating XML reports... 2022-11-23T02:48:45.7857309Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024334.xml 2022-11-23T02:48:45.7857682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7857910Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7858345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7858545Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7858823Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvxv5cl0j 2022-11-23T02:48:45.7859312Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvxv5cl0j/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7859341Z 2022-11-23T02:48:45.7859498Z Running tests... 2022-11-23T02:48:45.7859777Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7860096Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7860358Z test_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7860586Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77210 2022-11-23T02:48:45.7860809Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77211 2022-11-23T02:48:45.7861009Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77212 2022-11-23T02:48:45.7861233Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77213 2022-11-23T02:48:45.7861612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7861787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7862170Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7862362Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7862728Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7862909Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7863284Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7863460Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7863832Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7864003Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7864385Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7864575Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7864940Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7865117Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7865496Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7865738Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7866009Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6zixeu_w 2022-11-23T02:48:45.7866284Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6zixeu_w/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7866538Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3rk55nqr 2022-11-23T02:48:45.7866769Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7867036Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3rk55nqr/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7867262Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7867592Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfxhh__lg 2022-11-23T02:48:45.7867862Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfxhh__lg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7868104Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6pwub24u 2022-11-23T02:48:45.7868372Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6pwub24u/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7868595Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7868822Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7868924Z ok (6.044s) 
2022-11-23T02:48:45.7869151Z 2022-11-23T02:48:45.7869441Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7869560Z Ran 1 test in 6.044s 2022-11-23T02:48:45.7869584Z 2022-11-23T02:48:45.7869681Z OK 2022-11-23T02:48:45.7869700Z 2022-11-23T02:48:45.7869806Z Generating XML reports... 2022-11-23T02:48:45.7870500Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024340.xml 2022-11-23T02:48:45.7871263Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7871497Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7871901Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7872096Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7872355Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2repym3q 2022-11-23T02:48:45.7872627Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2repym3q/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7872651Z 2022-11-23T02:48:45.7872760Z Running tests... 2022-11-23T02:48:45.7873015Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7873329Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7873616Z test_allreduce_basics_cuda_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7873836Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77397 2022-11-23T02:48:45.7874057Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77398 2022-11-23T02:48:45.7874272Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77399 2022-11-23T02:48:45.7874486Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77400 2022-11-23T02:48:45.7874863Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7875027Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7875415Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7875709Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7876100Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7876277Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7876656Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7876846Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7877208Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7877450Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7877813Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7878003Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7878378Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7878548Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7878920Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7879108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7879370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpn09oxjin 2022-11-23T02:48:45.7879642Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpn09oxjin/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7879861Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7880117Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgbabfmwe 2022-11-23T02:48:45.7880397Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgbabfmwe/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7880628Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7880883Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2ltn3kl_ 2022-11-23T02:48:45.7881149Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2ltn3kl_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7881381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7881634Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9m8t2v98 2022-11-23T02:48:45.7881901Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9m8t2v98/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7882112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7882216Z ok (5.954s) 
2022-11-23T02:48:45.7882236Z 2022-11-23T02:48:45.7882513Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7882626Z Ran 1 test in 5.954s 2022-11-23T02:48:45.7882645Z 2022-11-23T02:48:45.7882736Z OK 2022-11-23T02:48:45.7882755Z 2022-11-23T02:48:45.7882881Z Generating XML reports... 2022-11-23T02:48:45.7883320Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024349.xml 2022-11-23T02:48:45.7883693Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7883852Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7884243Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7884434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7884739Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0kai70j9 2022-11-23T02:48:45.7885015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0kai70j9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7885035Z 2022-11-23T02:48:45.7885144Z Running tests... 2022-11-23T02:48:45.7885414Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7885730Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7886002Z test_allreduce_basics_using_work_api (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7886207Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77584 2022-11-23T02:48:45.7886475Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77585 2022-11-23T02:48:45.7886695Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77586 2022-11-23T02:48:45.7886911Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77587 2022-11-23T02:48:45.7887293Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7887472Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7887850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7888032Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7888399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7888592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7888978Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7889168Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7889543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7889717Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7890094Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7890282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7890650Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7890809Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7891189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7891377Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7891640Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1tbkosdk 2022-11-23T02:48:45.7891917Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1tbkosdk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7892175Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr61gu36j 2022-11-23T02:48:45.7892443Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr61gu36j/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7892695Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpob2j6bl7 2022-11-23T02:48:45.7892944Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpob2j6bl7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7893199Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg1rioizs 2022-11-23T02:48:45.7893469Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg1rioizs/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7893749Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7893989Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7894220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7894448Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7894551Z ok (4.168s) 
2022-11-23T02:48:45.7894570Z 2022-11-23T02:48:45.7894844Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7894938Z Ran 1 test in 4.169s 2022-11-23T02:48:45.7894957Z 2022-11-23T02:48:45.7895051Z OK 2022-11-23T02:48:45.7895070Z 2022-11-23T02:48:45.7895245Z Generating XML reports... 2022-11-23T02:48:45.7895688Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024357.xml 2022-11-23T02:48:45.7896066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7896244Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7896629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7896822Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7897063Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvuvhm9ti 2022-11-23T02:48:45.7897335Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvuvhm9ti/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7897355Z 2022-11-23T02:48:45.7897462Z Running tests... 2022-11-23T02:48:45.7897732Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7898050Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7898303Z test_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7898526Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77767 2022-11-23T02:48:45.7898747Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77768 2022-11-23T02:48:45.7898967Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77769 2022-11-23T02:48:45.7899168Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77770 2022-11-23T02:48:45.7899543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7899718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7900105Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7900298Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7900671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7900846Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7901225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7901396Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7901760Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7901933Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7902317Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7902509Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7902872Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7903095Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7903488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7903684Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7903926Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfxtl4op9 2022-11-23T02:48:45.7904201Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfxtl4op9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7904463Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsjrnv5_2 2022-11-23T02:48:45.7904784Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsjrnv5_2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7905038Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphg3xc5fi 2022-11-23T02:48:45.7905313Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphg3xc5fi/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7905548Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7905803Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp031cxiw2 2022-11-23T02:48:45.7906049Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp031cxiw2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7906281Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7906504Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7906730Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7906831Z ok (4.146s) 
2022-11-23T02:48:45.7906851Z 2022-11-23T02:48:45.7907124Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7907236Z Ran 1 test in 4.147s 2022-11-23T02:48:45.7907258Z 2022-11-23T02:48:45.7907351Z OK 2022-11-23T02:48:45.7907370Z 2022-11-23T02:48:45.7907492Z Generating XML reports... 2022-11-23T02:48:45.7907912Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024404.xml 2022-11-23T02:48:45.7908288Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7908464Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7908848Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7909314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7909583Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg2utsujz 2022-11-23T02:48:45.7909867Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg2utsujz/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7909888Z 2022-11-23T02:48:45.7909999Z Running tests... 2022-11-23T02:48:45.7910253Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7910571Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7910841Z test_allreduce_coalesced_async (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7911067Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 77950 2022-11-23T02:48:45.7911287Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 77951 2022-11-23T02:48:45.7911505Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 77952 2022-11-23T02:48:45.7911724Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 77953 2022-11-23T02:48:45.7912178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7912371Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7912742Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7912936Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7913305Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7913478Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7913861Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7914132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7914506Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7914687Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7915044Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7915235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7915609Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7915782Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7916155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7916347Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7916606Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk96d5kn3 2022-11-23T02:48:45.7916880Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk96d5kn3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7917135Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbaqy2r7h 2022-11-23T02:48:45.7917386Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbaqy2r7h/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7917618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7917872Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplkdx6m08 2022-11-23T02:48:45.7918138Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplkdx6m08/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7918369Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7918601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7918857Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5y6ho01t 2022-11-23T02:48:45.7919128Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5y6ho01t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7919337Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7919587Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.7919836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:48:45.7920080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 2 2022-11-23T02:48:45.7920323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 3 2022-11-23T02:48:45.7920740Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7921200Z INFO:torch.distributed.distributed_c10d:Rank 2: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7921614Z INFO:torch.distributed.distributed_c10d:Rank 3: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7922009Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 4 nodes. 2022-11-23T02:48:45.7922760Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7922859Z warnings.warn( 2022-11-23T02:48:45.7923660Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. 
If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7923775Z warnings.warn( 2022-11-23T02:48:45.7924517Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7924629Z warnings.warn( 2022-11-23T02:48:45.7925360Z /opt/conda/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py:1638: UserWarning: torch.distributed.all_reduce_coalesced will be deprecated. If you must use it, please revisit our documentation later at https://pytorch.org/docs/master/distributed.html#collective-functions 2022-11-23T02:48:45.7925470Z warnings.warn( 2022-11-23T02:48:45.7925570Z ok (4.152s) 2022-11-23T02:48:45.7925591Z 2022-11-23T02:48:45.7925861Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7925973Z Ran 1 test in 4.152s 2022-11-23T02:48:45.7925992Z 2022-11-23T02:48:45.7926068Z OK 2022-11-23T02:48:45.7926087Z 2022-11-23T02:48:45.7926211Z Generating XML reports... 
2022-11-23T02:48:45.7926647Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024410.xml 2022-11-23T02:48:45.7927024Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7927201Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7927587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7927783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7928045Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdojpkkv8 2022-11-23T02:48:45.7928303Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdojpkkv8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7928340Z 2022-11-23T02:48:45.7928433Z Running tests... 2022-11-23T02:48:45.7928700Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7929015Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7929285Z test_allreduce_coalesced_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7929507Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78133 2022-11-23T02:48:45.7929728Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78134 2022-11-23T02:48:45.7929946Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78135 2022-11-23T02:48:45.7930168Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78136 2022-11-23T02:48:45.7930529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7930758Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7931158Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7931354Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7931727Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7931906Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7932281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7932504Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7932874Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7933072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7933456Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7933649Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7934018Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7934192Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7934566Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7934756Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7935022Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw_35emka 2022-11-23T02:48:45.7935277Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw_35emka/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7935539Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe1y5vzms 2022-11-23T02:48:45.7935792Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkhcxbrvo 2022-11-23T02:48:45.7936063Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe1y5vzms/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7936333Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkhcxbrvo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7936564Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7936796Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7937027Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7937264Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb0d11rs0 2022-11-23T02:48:45.7937531Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb0d11rs0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7937757Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7937858Z ok (4.108s) 
2022-11-23T02:48:45.7937878Z 2022-11-23T02:48:45.7938152Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7938263Z Ran 1 test in 4.108s 2022-11-23T02:48:45.7938283Z 2022-11-23T02:48:45.7938375Z OK 2022-11-23T02:48:45.7938394Z 2022-11-23T02:48:45.7938519Z Generating XML reports... 2022-11-23T02:48:45.7938959Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024417.xml 2022-11-23T02:48:45.7939323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7939502Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7939945Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7940148Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7940409Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpeh355uj6 2022-11-23T02:48:45.7940680Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpeh355uj6/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7940699Z 2022-11-23T02:48:45.7940807Z Running tests... 2022-11-23T02:48:45.7941076Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7941373Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7941694Z test_allreduce_coalesced_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7941916Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78316 2022-11-23T02:48:45.7942145Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78317 2022-11-23T02:48:45.7942365Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78318 2022-11-23T02:48:45.7942578Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78319 2022-11-23T02:48:45.7942961Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7943139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7943529Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7943703Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7944079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7944253Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7944636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7944829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7945193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7945365Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7945740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7945973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7946360Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7946536Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7946915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7947107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7947364Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe_64f24j 2022-11-23T02:48:45.7947633Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe_64f24j/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7947891Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv5ba66io 2022-11-23T02:48:45.7948155Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv5ba66io/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7948373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7948603Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7948911Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgrwsl_i8 2022-11-23T02:48:45.7949523Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgrwsl_i8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7949761Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7950020Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2q4c_qz5 2022-11-23T02:48:45.7950290Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2q4c_qz5/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7950522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7950604Z ok (4.119s) 
2022-11-23T02:48:45.7950721Z 2022-11-23T02:48:45.7950992Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7951107Z Ran 1 test in 4.119s 2022-11-23T02:48:45.7951127Z 2022-11-23T02:48:45.7951221Z OK 2022-11-23T02:48:45.7951240Z 2022-11-23T02:48:45.7951367Z Generating XML reports... 2022-11-23T02:48:45.7951814Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024423.xml 2022-11-23T02:48:45.7952190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7952369Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7952755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7952929Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7953187Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1xhe1xqi 2022-11-23T02:48:45.7953463Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1xhe1xqi/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7953482Z 2022-11-23T02:48:45.7953589Z Running tests... 2022-11-23T02:48:45.7953858Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7954171Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7954451Z test_allreduce_coalesced_checks_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7954673Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78499 2022-11-23T02:48:45.7954875Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78500 2022-11-23T02:48:45.7955094Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78501 2022-11-23T02:48:45.7955306Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78502 2022-11-23T02:48:45.7955685Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7955861Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7956250Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7956444Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7956819Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7956996Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7957357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7957548Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7957921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7958096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7958585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7958784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7959162Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7959337Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7959694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7959887Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7960146Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq7citlwh 2022-11-23T02:48:45.7960476Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq7citlwh/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7960734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq92xb2vn 2022-11-23T02:48:45.7961015Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq92xb2vn/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7961272Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps248kx9u 2022-11-23T02:48:45.7961538Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps248kx9u/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7961768Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7961977Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7962206Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7962464Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwti24kuy 2022-11-23T02:48:45.7962729Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwti24kuy/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7962956Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7963059Z ok (6.092s) 
2022-11-23T02:48:45.7963078Z 2022-11-23T02:48:45.7963351Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7963463Z Ran 1 test in 6.092s 2022-11-23T02:48:45.7963482Z 2022-11-23T02:48:45.7963556Z OK 2022-11-23T02:48:45.7963594Z 2022-11-23T02:48:45.7963701Z Generating XML reports... 2022-11-23T02:48:45.7964140Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024429.xml 2022-11-23T02:48:45.7964511Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7964692Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7965078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7965274Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7965532Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqu4gpvle 2022-11-23T02:48:45.7965801Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqu4gpvle/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7965821Z 2022-11-23T02:48:45.7965910Z Running tests... 2022-11-23T02:48:45.7966176Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7966492Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7966762Z test_allreduce_coalesced_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7966990Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78686 2022-11-23T02:48:45.7967208Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78687 2022-11-23T02:48:45.7967476Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78688 2022-11-23T02:48:45.7967698Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78689 2022-11-23T02:48:45.7968060Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7968241Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7968629Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7968823Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7969193Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7969420Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7969808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7970005Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7970374Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7970530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7970907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7971097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7971469Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7971645Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7972021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7972211Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7972474Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppskuc7hc 2022-11-23T02:48:45.7972733Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppskuc7hc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7972992Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx06aj1_q 2022-11-23T02:48:45.7973260Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx06aj1_q/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7973489Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7973724Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7973976Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpvge2qrjo 2022-11-23T02:48:45.7974250Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpvge2qrjo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7974506Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcttc9rbo 2022-11-23T02:48:45.7974774Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcttc9rbo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7974985Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7975213Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7975313Z ok (4.549s) 
2022-11-23T02:48:45.7975334Z 2022-11-23T02:48:45.7975607Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7975722Z Ran 1 test in 4.549s 2022-11-23T02:48:45.7975741Z 2022-11-23T02:48:45.7975832Z OK 2022-11-23T02:48:45.7975850Z 2022-11-23T02:48:45.7975973Z Generating XML reports... 2022-11-23T02:48:45.7976409Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024438.xml 2022-11-23T02:48:45.7976812Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7976998Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7977387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7977583Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7977845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplhmzqhap 2022-11-23T02:48:45.7978121Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplhmzqhap/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7978248Z 2022-11-23T02:48:45.7978364Z Running tests... 2022-11-23T02:48:45.7978631Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7978952Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7979187Z test_allreduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7979412Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 78893 2022-11-23T02:48:45.7979632Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 78894 2022-11-23T02:48:45.7979851Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 78895 2022-11-23T02:48:45.7980068Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 78896 2022-11-23T02:48:45.7980445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7980626Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7981015Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7981192Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7981564Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7981740Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7982117Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7982312Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7982672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7982849Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7983224Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7983413Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7983773Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7983947Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7984319Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7984508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7984768Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp16jcdpd8 2022-11-23T02:48:45.7985027Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx8fcsio0 2022-11-23T02:48:45.7985303Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp16jcdpd8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7985567Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx8fcsio0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7985856Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu34zqxy4 2022-11-23T02:48:45.7986138Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu34zqxy4/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7986373Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.7986601Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.7986859Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb_phal60 2022-11-23T02:48:45.7987125Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb_phal60/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7987406Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.7987636Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7987739Z ok (4.350s) 
2022-11-23T02:48:45.7987759Z 2022-11-23T02:48:45.7988018Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7988132Z Ran 1 test in 4.350s 2022-11-23T02:48:45.7988152Z 2022-11-23T02:48:45.7988244Z OK 2022-11-23T02:48:45.7988263Z 2022-11-23T02:48:45.7988388Z Generating XML reports... 2022-11-23T02:48:45.7988828Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024445.xml 2022-11-23T02:48:45.7989549Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7989729Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7990124Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7990300Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7990562Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2ckg6lot 2022-11-23T02:48:45.7990836Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2ckg6lot/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7990857Z 2022-11-23T02:48:45.7990966Z Running tests... 2022-11-23T02:48:45.7991233Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.7991550Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.7991810Z test_allreduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.7992031Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79100 2022-11-23T02:48:45.7992255Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79101 2022-11-23T02:48:45.7992455Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79102 2022-11-23T02:48:45.7992669Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79103 2022-11-23T02:48:45.7993049Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7993225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7993608Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7993799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7994172Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7994346Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7994712Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7994901Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7995345Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7995529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7995912Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7996103Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7996472Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.7996649Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.7997096Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.7997267Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.7997533Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpr0e75hvn 2022-11-23T02:48:45.7997813Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpr0e75hvn/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7998070Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo_vu62yl 2022-11-23T02:48:45.7998340Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo_vu62yl/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7998573Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.7998831Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjm7dgftv 2022-11-23T02:48:45.7999104Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjm7dgftv/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7999343Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpq34l8di1 2022-11-23T02:48:45.7999615Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpq34l8di1/_remote_module_non_scriptable.py 2022-11-23T02:48:45.7999844Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8000073Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8000300Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8000401Z ok (6.331s) 
2022-11-23T02:48:45.8000421Z 2022-11-23T02:48:45.8000691Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8000801Z Ran 1 test in 6.332s 2022-11-23T02:48:45.8000820Z 2022-11-23T02:48:45.8000893Z OK 2022-11-23T02:48:45.8000931Z 2022-11-23T02:48:45.8001041Z Generating XML reports... 2022-11-23T02:48:45.8001479Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024451.xml 2022-11-23T02:48:45.8001857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8002036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8002421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8002612Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8002868Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7ppvnsq0 2022-11-23T02:48:45.8003138Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7ppvnsq0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8003157Z 2022-11-23T02:48:45.8003246Z Running tests... 2022-11-23T02:48:45.8003520Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8003833Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8004087Z test_barrier_implies_wait (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8004364Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79311 2022-11-23T02:48:45.8004593Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79312 2022-11-23T02:48:45.8004801Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79313 2022-11-23T02:48:45.8005008Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79314 2022-11-23T02:48:45.8005366Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8005531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8005958Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8006139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8006499Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8006669Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8007038Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8007227Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8007594Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8007749Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8008135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8008328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8008706Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8008879Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8009257Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8009445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8009704Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpaf3i2rq3 2022-11-23T02:48:45.8009960Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpaf3i2rq3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8010217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpor2byha2 2022-11-23T02:48:45.8010490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpor2byha2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8010722Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8010952Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8011210Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0zq4p4ia 2022-11-23T02:48:45.8011478Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0zq4p4ia/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8011732Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0uafu326 2022-11-23T02:48:45.8011996Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0uafu326/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8012204Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8012436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8012536Z ok (4.167s) 
2022-11-23T02:48:45.8012557Z 2022-11-23T02:48:45.8012829Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8012941Z Ran 1 test in 4.168s 2022-11-23T02:48:45.8013008Z 2022-11-23T02:48:45.8013106Z OK 2022-11-23T02:48:45.8013125Z 2022-11-23T02:48:45.8013251Z Generating XML reports... 2022-11-23T02:48:45.8013693Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024500.xml 2022-11-23T02:48:45.8014068Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8014228Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8014611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8014855Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8015115Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp45nulfnn 2022-11-23T02:48:45.8015393Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp45nulfnn/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8015412Z 2022-11-23T02:48:45.8015520Z Running tests... 2022-11-23T02:48:45.8015787Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8016104Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8016337Z test_broadcast_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8016558Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79494 2022-11-23T02:48:45.8016777Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79495 2022-11-23T02:48:45.8016993Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79496 2022-11-23T02:48:45.8017212Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79497 2022-11-23T02:48:45.8017589Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8017768Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8018153Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8018328Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8018699Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8018874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8019253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8019447Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8019811Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8019987Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8020364Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8020551Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8020911Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8021083Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8021460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8021650Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8021909Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpai7go8n9 2022-11-23T02:48:45.8022231Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpai7go8n9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8022474Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8022734Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxcesa6m6 2022-11-23T02:48:45.8022986Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxcesa6m6/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8023220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8023480Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnn05dnph 2022-11-23T02:48:45.8023747Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnn05dnph/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8024026Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8024284Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpbmy51cvd 2022-11-23T02:48:45.8024556Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpbmy51cvd/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8024782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8024881Z ok (4.159s) 
2022-11-23T02:48:45.8024901Z 2022-11-23T02:48:45.8025153Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8025266Z Ran 1 test in 4.160s 2022-11-23T02:48:45.8025285Z 2022-11-23T02:48:45.8025377Z OK 2022-11-23T02:48:45.8025396Z 2022-11-23T02:48:45.8025523Z Generating XML reports... 2022-11-23T02:48:45.8025963Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024507.xml 2022-11-23T02:48:45.8026342Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8026518Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8026904Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8027080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8027337Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3v8fyhik 2022-11-23T02:48:45.8027607Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3v8fyhik/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8027627Z 2022-11-23T02:48:45.8027734Z Running tests... 2022-11-23T02:48:45.8027999Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8028314Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8028577Z test_broadcast_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8028798Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79677 2022-11-23T02:48:45.8029205Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79678 2022-11-23T02:48:45.8029416Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79679 2022-11-23T02:48:45.8029636Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79680 2022-11-23T02:48:45.8030020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8030199Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8030586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8030781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8031155Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8031331Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8031771Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8031979Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8032353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8032527Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8032907Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8033099Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8033543Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8033718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8034099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8034272Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8034534Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcx_o5eyb 2022-11-23T02:48:45.8034807Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcx_o5eyb/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8035042Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8035301Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp378ct849 2022-11-23T02:48:45.8035571Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp378ct849/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8035825Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwbwrdb4h 2022-11-23T02:48:45.8036098Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwbwrdb4h/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8036335Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpsutjzmcf 2022-11-23T02:48:45.8036607Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpsutjzmcf/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8036838Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8037068Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8037295Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8037395Z ok (6.020s) 
2022-11-23T02:48:45.8037419Z 2022-11-23T02:48:45.8037689Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8037801Z Ran 1 test in 6.020s 2022-11-23T02:48:45.8037820Z 2022-11-23T02:48:45.8037910Z OK 2022-11-23T02:48:45.8037930Z 2022-11-23T02:48:45.8038039Z Generating XML reports... 2022-11-23T02:48:45.8038481Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024513.xml 2022-11-23T02:48:45.8038858Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8039036Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8039421Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8039616Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8039873Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyyk8kbga 2022-11-23T02:48:45.8040149Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyyk8kbga/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8040168Z 2022-11-23T02:48:45.8040279Z Running tests... 2022-11-23T02:48:45.8040582Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8040907Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8041162Z test_broadcast_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8041385Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 79864 2022-11-23T02:48:45.8041605Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 79865 2022-11-23T02:48:45.8041824Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 79866 2022-11-23T02:48:45.8042045Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 79867 2022-11-23T02:48:45.8042491Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8042652Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8043041Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8043235Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8043612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8043787Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8044165Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8044357Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8044731Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8044888Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8045269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8045459Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8045827Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8046002Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8046375Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8046565Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8046829Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl4duq2og 2022-11-23T02:48:45.8047105Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl4duq2og/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8047342Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_oqtzfv2 2022-11-23T02:48:45.8047616Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_oqtzfv2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8047848Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8048077Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8048334Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpofpgsdym 2022-11-23T02:48:45.8048606Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpofpgsdym/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8048835Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8049095Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkqh8e0kk 2022-11-23T02:48:45.8049349Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkqh8e0kk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8049646Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8049757Z ok (4.159s) 
2022-11-23T02:48:45.8049778Z 2022-11-23T02:48:45.8050051Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8050165Z Ran 1 test in 4.160s 2022-11-23T02:48:45.8050185Z 2022-11-23T02:48:45.8050279Z OK 2022-11-23T02:48:45.8050298Z 2022-11-23T02:48:45.8050424Z Generating XML reports... 2022-11-23T02:48:45.8050862Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024521.xml 2022-11-23T02:48:45.8051235Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8051446Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8051836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8052033Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8052295Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi1bjbcmg 2022-11-23T02:48:45.8052570Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi1bjbcmg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8052590Z 2022-11-23T02:48:45.8052701Z Running tests... 2022-11-23T02:48:45.8052972Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8053287Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8053523Z test_broadcast_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8053748Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80047 2022-11-23T02:48:45.8053967Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80048 2022-11-23T02:48:45.8054187Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80049 2022-11-23T02:48:45.8054402Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80050 2022-11-23T02:48:45.8054781Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8054959Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8055344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8055537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8055890Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8056070Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8056451Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8056645Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8057013Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8057186Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8057560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8057750Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8058106Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8058325Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8058707Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8058951Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8059220Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph3b66k48 2022-11-23T02:48:45.8059493Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph3b66k48/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8059729Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8059987Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4eng4moj 2022-11-23T02:48:45.8060257Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4eng4moj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8060541Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpkuvt3sju 2022-11-23T02:48:45.8060812Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpkuvt3sju/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8061071Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0cho6r04 2022-11-23T02:48:45.8061339Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0cho6r04/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8061570Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8061799Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8062021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8062124Z ok (4.228s) 
2022-11-23T02:48:45.8062143Z 2022-11-23T02:48:45.8062398Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8062514Z Ran 1 test in 4.228s 2022-11-23T02:48:45.8062533Z 2022-11-23T02:48:45.8062627Z OK 2022-11-23T02:48:45.8062646Z 2022-11-23T02:48:45.8062772Z Generating XML reports... 2022-11-23T02:48:45.8063220Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024528.xml 2022-11-23T02:48:45.8063593Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8063771Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8064157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8064350Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8064593Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc207mhso 2022-11-23T02:48:45.8064866Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc207mhso/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8064889Z 2022-11-23T02:48:45.8064999Z Running tests... 2022-11-23T02:48:45.8065268Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8065585Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8065846Z test_broadcast_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8066070Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80254 2022-11-23T02:48:45.8066294Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80255 2022-11-23T02:48:45.8066493Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80256 2022-11-23T02:48:45.8066715Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80257 2022-11-23T02:48:45.8067093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8067274Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8067661Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8067903Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8068290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8068469Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8068855Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8069213Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8069600Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8069856Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8070241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8070440Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8070807Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8070981Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8071355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8071526Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8071786Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgwglsnom 2022-11-23T02:48:45.8072061Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgwglsnom/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8072322Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpu79xckjo 2022-11-23T02:48:45.8072594Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpu79xckjo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8072852Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3m6ro8u0 2022-11-23T02:48:45.8073120Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3m6ro8u0/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8073354Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8073585Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8073798Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8074054Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpty2loe2m 2022-11-23T02:48:45.8074324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpty2loe2m/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8074551Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8074654Z ok (6.261s) 
2022-11-23T02:48:45.8074678Z 2022-11-23T02:48:45.8074951Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8075065Z Ran 1 test in 6.261s 2022-11-23T02:48:45.8075084Z 2022-11-23T02:48:45.8075178Z OK 2022-11-23T02:48:45.8075196Z 2022-11-23T02:48:45.8075301Z Generating XML reports... 2022-11-23T02:48:45.8075744Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024535.xml 2022-11-23T02:48:45.8076118Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8076295Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8076687Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8076879Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8077203Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpja2jskoo 2022-11-23T02:48:45.8077489Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpja2jskoo/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8077509Z 2022-11-23T02:48:45.8077622Z Running tests... 2022-11-23T02:48:45.8077873Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8078194Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8078441Z test_empty_tensors (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8078665Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80465 2022-11-23T02:48:45.8078940Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80466 2022-11-23T02:48:45.8079161Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80467 2022-11-23T02:48:45.8079378Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80468 2022-11-23T02:48:45.8079761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8079921Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8080307Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8080499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8080867Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8081041Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8081425Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8081619Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8081989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8082145Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8082525Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8082717Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8083088Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8083262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8083652Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8083842Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8084106Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpg2mk5095 2022-11-23T02:48:45.8084379Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpg2mk5095/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8084618Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp_r9h03t2 2022-11-23T02:48:45.8084884Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp_r9h03t2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8085140Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2vv7jptn 2022-11-23T02:48:45.8085408Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2vv7jptn/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8085662Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpho5cu_50 2022-11-23T02:48:45.8085930Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpho5cu_50/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8086220Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8086464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8086696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8086907Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8087009Z ok (4.135s) 
2022-11-23T02:48:45.8087030Z 2022-11-23T02:48:45.8087303Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8087416Z Ran 1 test in 4.135s 2022-11-23T02:48:45.8087435Z 2022-11-23T02:48:45.8087528Z OK 2022-11-23T02:48:45.8087547Z 2022-11-23T02:48:45.8087722Z Generating XML reports... 2022-11-23T02:48:45.8088162Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024543.xml 2022-11-23T02:48:45.8088541Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8088699Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8089083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8089275Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8089530Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4vpe93zj 2022-11-23T02:48:45.8089800Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4vpe93zj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8089820Z 2022-11-23T02:48:45.8089928Z Running tests... 2022-11-23T02:48:45.8090198Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8090511Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8090739Z test_gather_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8090964Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80648 2022-11-23T02:48:45.8091185Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80649 2022-11-23T02:48:45.8091401Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80650 2022-11-23T02:48:45.8091616Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80651 2022-11-23T02:48:45.8091989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8092167Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8092557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8092748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8093102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8093282Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8093664Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8093854Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8094227Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8094400Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8094777Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8094973Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8095324Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8095552Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8095940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8096131Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8096395Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa8hpnkya 2022-11-23T02:48:45.8096674Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa8hpnkya/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8096908Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8097217Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl5r_z3pf 2022-11-23T02:48:45.8097490Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl5r_z3pf/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8097729Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9eds4_6d 2022-11-23T02:48:45.8097997Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9eds4_6d/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8098226Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8098449Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8098701Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1_5zup5d 2022-11-23T02:48:45.8098969Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1_5zup5d/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8099202Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8099302Z ok (4.145s) 
2022-11-23T02:48:45.8099323Z 2022-11-23T02:48:45.8099580Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8099693Z Ran 1 test in 4.145s 2022-11-23T02:48:45.8099716Z 2022-11-23T02:48:45.8099810Z OK 2022-11-23T02:48:45.8099829Z 2022-11-23T02:48:45.8099955Z Generating XML reports... 2022-11-23T02:48:45.8100397Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024550.xml 2022-11-23T02:48:45.8100775Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8100951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8101338Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8101535Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8101775Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv39hjf9o 2022-11-23T02:48:45.8102050Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv39hjf9o/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8102070Z 2022-11-23T02:48:45.8102179Z Running tests... 2022-11-23T02:48:45.8102445Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8102762Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8103017Z test_gather_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8103238Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 80831 2022-11-23T02:48:45.8103458Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 80832 2022-11-23T02:48:45.8103659Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 80833 2022-11-23T02:48:45.8103884Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 80834 2022-11-23T02:48:45.8104258Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8104484Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8104881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8105075Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8105445Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8105621Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8106009Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8106244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8106618Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8106797Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8107183Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8107375Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8107743Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8107916Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8108289Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8108466Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8108729Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgaogdhih 2022-11-23T02:48:45.8109249Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgaogdhih/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8109522Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo49cxm3r 2022-11-23T02:48:45.8109796Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo49cxm3r/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8110031Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8110288Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpimd_umuv 2022-11-23T02:48:45.8110558Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpimd_umuv/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8110792Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8111007Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8111262Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprpefk0uk 2022-11-23T02:48:45.8111536Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprpefk0uk/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8111765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8111868Z ok (5.977s) 
2022-11-23T02:48:45.8111890Z 2022-11-23T02:48:45.8112170Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8112285Z Ran 1 test in 5.977s 2022-11-23T02:48:45.8112304Z 2022-11-23T02:48:45.8112397Z OK 2022-11-23T02:48:45.8112416Z 2022-11-23T02:48:45.8112521Z Generating XML reports... 2022-11-23T02:48:45.8112961Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024556.xml 2022-11-23T02:48:45.8113344Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8113521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8113981Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8114186Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8114446Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3tutluds 2022-11-23T02:48:45.8114726Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3tutluds/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8114746Z 2022-11-23T02:48:45.8114856Z Running tests... 2022-11-23T02:48:45.8115108Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8115423Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8115734Z test_gather_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8115958Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81018 2022-11-23T02:48:45.8116185Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81019 2022-11-23T02:48:45.8116404Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81020 2022-11-23T02:48:45.8116620Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81021 2022-11-23T02:48:45.8117001Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8117160Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8117545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8117743Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8118112Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8118286Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8118667Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8118859Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8119225Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8119397Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8119755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8119944Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8120324Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8120499Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8120885Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8121076Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8121336Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpjmtesgzc 2022-11-23T02:48:45.8121611Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpjmtesgzc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8121855Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7egu9_u 2022-11-23T02:48:45.8122126Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7egu9_u/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8122359Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8122591Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8122949Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp00gp3aug 2022-11-23T02:48:45.8123230Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp00gp3aug/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8123464Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8123722Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk8fzi11b 2022-11-23T02:48:45.8123993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk8fzi11b/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8124205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8124306Z ok (4.179s) 
2022-11-23T02:48:45.8124371Z 2022-11-23T02:48:45.8124652Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8124767Z Ran 1 test in 4.179s 2022-11-23T02:48:45.8124787Z 2022-11-23T02:48:45.8124880Z OK 2022-11-23T02:48:45.8124899Z 2022-11-23T02:48:45.8125025Z Generating XML reports... 2022-11-23T02:48:45.8125467Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024604.xml 2022-11-23T02:48:45.8125846Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8126006Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8126393Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8126586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8126840Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp53xs_3nc 2022-11-23T02:48:45.8127109Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp53xs_3nc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8127128Z 2022-11-23T02:48:45.8127236Z Running tests... 2022-11-23T02:48:45.8127510Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8127828Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8128097Z test_gather_noncontiguous_input (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8128301Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81201 2022-11-23T02:48:45.8128524Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81202 2022-11-23T02:48:45.8128739Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81203 2022-11-23T02:48:45.8128953Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81204 2022-11-23T02:48:45.8129334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8129513Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8129905Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8130098Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8130449Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8130627Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8131003Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8131194Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8131571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8131745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8132168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8132366Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8132740Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8132898Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8133275Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8133462Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8133724Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppgbq_tw3 2022-11-23T02:48:45.8134053Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppgbq_tw3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8134287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8134549Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxst6nia5 2022-11-23T02:48:45.8134825Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxst6nia5/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8135063Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb6822xaz 2022-11-23T02:48:45.8135328Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb6822xaz/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8135556Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8135787Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8136042Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp87auk72r 2022-11-23T02:48:45.8136307Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp87auk72r/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8136540Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8136642Z ok (4.190s) 
2022-11-23T02:48:45.8136661Z 2022-11-23T02:48:45.8136936Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8137031Z Ran 1 test in 4.190s 2022-11-23T02:48:45.8137051Z 2022-11-23T02:48:45.8137143Z OK 2022-11-23T02:48:45.8137162Z 2022-11-23T02:48:45.8137288Z Generating XML reports... 2022-11-23T02:48:45.8137726Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024611.xml 2022-11-23T02:48:45.8138099Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8138281Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8138670Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8138866Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8139103Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpklqw99na 2022-11-23T02:48:45.8139371Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpklqw99na/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8139390Z 2022-11-23T02:48:45.8139498Z Running tests... 2022-11-23T02:48:45.8139765Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8140081Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8140324Z test_gather_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8140547Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81384 2022-11-23T02:48:45.8140769Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81385 2022-11-23T02:48:45.8141037Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81386 2022-11-23T02:48:45.8141244Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81387 2022-11-23T02:48:45.8141625Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8141804Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8142190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8142383Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8142751Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8142977Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8143365Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8143539Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8143916Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8144093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8144475Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8144663Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8145030Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8145207Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8145586Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8145781Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8146023Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprqbagjyt 2022-11-23T02:48:45.8146297Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprqbagjyt/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8146552Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpty_3zu2d 2022-11-23T02:48:45.8146818Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpty_3zu2d/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8147076Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpc3uuu0kv 2022-11-23T02:48:45.8147350Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpc3uuu0kv/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8147582Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8147839Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb_pgu6cp 2022-11-23T02:48:45.8148087Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb_pgu6cp/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8148318Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8148539Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8148765Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8148867Z ok (4.756s) 
2022-11-23T02:48:45.8148887Z 2022-11-23T02:48:45.8149403Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8149526Z Ran 1 test in 4.757s 2022-11-23T02:48:45.8149546Z 2022-11-23T02:48:45.8149641Z OK 2022-11-23T02:48:45.8149661Z 2022-11-23T02:48:45.8149789Z Generating XML reports... 2022-11-23T02:48:45.8150291Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024618.xml 2022-11-23T02:48:45.8150688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8150869Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8151256Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8151453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8151711Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpe5pxfpx8 2022-11-23T02:48:45.8151987Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpe5pxfpx8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8152091Z 2022-11-23T02:48:45.8152208Z Running tests... 2022-11-23T02:48:45.8152459Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8152781Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8153039Z test_gather_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8153264Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81591 2022-11-23T02:48:45.8153483Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81592 2022-11-23T02:48:45.8153703Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81593 2022-11-23T02:48:45.8153920Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81594 2022-11-23T02:48:45.8154298Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8154462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8154853Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8155048Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8155418Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8155593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8155973Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8156165Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8156533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8156711Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8157069Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8157262Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8157635Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8157807Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8158181Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8158415Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8158679Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp21w1z7w5 2022-11-23T02:48:45.8158954Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp21w1z7w5/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8159220Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd58783rw 2022-11-23T02:48:45.8159471Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd58783rw/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8159781Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8_9bgwgv 2022-11-23T02:48:45.8160058Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8_9bgwgv/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8160291Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8160520Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8160752Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8161008Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplyg82in_ 2022-11-23T02:48:45.8161324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplyg82in_/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8161529Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8161632Z ok (7.740s) 
2022-11-23T02:48:45.8161653Z 2022-11-23T02:48:45.8161929Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8162043Z Ran 1 test in 7.740s 2022-11-23T02:48:45.8162062Z 2022-11-23T02:48:45.8162156Z OK 2022-11-23T02:48:45.8162175Z 2022-11-23T02:48:45.8162301Z Generating XML reports... 2022-11-23T02:48:45.8162741Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024625.xml 2022-11-23T02:48:45.8163113Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8163289Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8163662Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8163857Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8164117Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9gn0hltx 2022-11-23T02:48:45.8164389Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9gn0hltx/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8164409Z 2022-11-23T02:48:45.8164520Z Running tests... 2022-11-23T02:48:45.8164788Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8165103Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8165367Z test_multi_device_constructor (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8165572Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81802 2022-11-23T02:48:45.8165797Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81803 2022-11-23T02:48:45.8166014Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81804 2022-11-23T02:48:45.8166229Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81805 2022-11-23T02:48:45.8166607Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8166785Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8167167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8167358Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8167709Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8167885Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8168269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8168458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8168879Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8169058Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8169438Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8169626Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8169993Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8170151Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8170596Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8170784Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8171051Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3beua6h4 2022-11-23T02:48:45.8171325Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3beua6h4/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8171582Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp44rrl0lm 2022-11-23T02:48:45.8171854Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp44rrl0lm/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8172086Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8172323Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9zpg4xw7 2022-11-23T02:48:45.8172591Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9zpg4xw7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8172852Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4m3gkx2v 2022-11-23T02:48:45.8173126Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4m3gkx2v/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8173356Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8173586Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8173806Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8173907Z ok (4.133s) 
2022-11-23T02:48:45.8173928Z 2022-11-23T02:48:45.8174201Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8174296Z Ran 1 test in 4.134s 2022-11-23T02:48:45.8174316Z 2022-11-23T02:48:45.8174407Z OK 2022-11-23T02:48:45.8174426Z 2022-11-23T02:48:45.8174557Z Generating XML reports... 2022-11-23T02:48:45.8174995Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024635.xml 2022-11-23T02:48:45.8175372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8175551Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8175936Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8176129Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8176368Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw1gwq6b9 2022-11-23T02:48:45.8176638Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw1gwq6b9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8176658Z 2022-11-23T02:48:45.8176766Z Running tests... 2022-11-23T02:48:45.8177038Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8177353Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8177598Z test_reduce_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8177867Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 81989 2022-11-23T02:48:45.8178095Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 81990 2022-11-23T02:48:45.8178317Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 81991 2022-11-23T02:48:45.8178519Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 81992 2022-11-23T02:48:45.8178899Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8179080Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8179521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8179715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8180087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8180265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8180645Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8180816Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8181184Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8181359Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8181736Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8181931Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8182307Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8182482Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8182857Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8183047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8183289Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2uykmadi 2022-11-23T02:48:45.8183564Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2uykmadi/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8183795Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8184053Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplq8x2rfm 2022-11-23T02:48:45.8184324Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplq8x2rfm/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8184582Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmprsb958sb 2022-11-23T02:48:45.8184853Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmprsb958sb/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8185081Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8185293Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8185546Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv7e5azyl 2022-11-23T02:48:45.8185815Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv7e5azyl/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8186049Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8186149Z ok (4.132s) 
2022-11-23T02:48:45.8186169Z 2022-11-23T02:48:45.8186441Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8186605Z Ran 1 test in 4.132s 2022-11-23T02:48:45.8186626Z 2022-11-23T02:48:45.8186723Z OK 2022-11-23T02:48:45.8186742Z 2022-11-23T02:48:45.8186868Z Generating XML reports... 2022-11-23T02:48:45.8187294Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024641.xml 2022-11-23T02:48:45.8187671Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8187850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8188240Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8188485Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8188746Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9rk_lyg6 2022-11-23T02:48:45.8189256Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9rk_lyg6/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8189277Z 2022-11-23T02:48:45.8189394Z Running tests... 2022-11-23T02:48:45.8189652Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8189970Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8190229Z test_reduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8190450Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82172 2022-11-23T02:48:45.8190669Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82173 2022-11-23T02:48:45.8190886Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82174 2022-11-23T02:48:45.8191104Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82175 2022-11-23T02:48:45.8191484Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8191661Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8192020Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8192198Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8192583Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8192776Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8193157Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8193353Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8193725Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8193902Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8194269Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8194456Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8194839Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8195014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8195392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8195586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8195850Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfu2s3a4t 2022-11-23T02:48:45.8196202Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfu2s3a4t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8196476Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqm5nl8kt 2022-11-23T02:48:45.8196729Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqm5nl8kt/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8196989Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx8zmwi75 2022-11-23T02:48:45.8197258Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx8zmwi75/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8197490Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8197718Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8198038Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp2inepvch 2022-11-23T02:48:45.8198309Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp2inepvch/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8198544Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8198750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8198853Z ok (5.930s) 
2022-11-23T02:48:45.8198873Z 2022-11-23T02:48:45.8199142Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8199256Z Ran 1 test in 5.930s 2022-11-23T02:48:45.8199276Z 2022-11-23T02:48:45.8199367Z OK 2022-11-23T02:48:45.8199387Z 2022-11-23T02:48:45.8199510Z Generating XML reports... 2022-11-23T02:48:45.8199950Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024648.xml 2022-11-23T02:48:45.8200331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8200509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8200881Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8201078Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8201338Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpmpk7n17z 2022-11-23T02:48:45.8201611Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpmpk7n17z/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8201630Z 2022-11-23T02:48:45.8201740Z Running tests... 2022-11-23T02:48:45.8202007Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8202323Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8202572Z test_reduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8202774Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82359 2022-11-23T02:48:45.8202996Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82360 2022-11-23T02:48:45.8203214Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82361 2022-11-23T02:48:45.8203432Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82362 2022-11-23T02:48:45.8203808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8203986Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8204372Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8204567Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8204919Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8205096Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8205523Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8205721Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8206088Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8206262Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8206639Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8206829Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8207246Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8207401Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8207790Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8207977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8208235Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpf7oy_kqy 2022-11-23T02:48:45.8208504Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpf7oy_kqy/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8208761Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1f04ns7g 2022-11-23T02:48:45.8209032Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1f04ns7g/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8209294Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp50gq9ybw 2022-11-23T02:48:45.8209546Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp50gq9ybw/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8209783Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8210012Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8210242Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8210497Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqe15xk26 2022-11-23T02:48:45.8210766Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqe15xk26/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8210994Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8211095Z ok (4.160s) 
2022-11-23T02:48:45.8211119Z 2022-11-23T02:48:45.8211391Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8211487Z Ran 1 test in 4.160s 2022-11-23T02:48:45.8211506Z 2022-11-23T02:48:45.8211600Z OK 2022-11-23T02:48:45.8211620Z 2022-11-23T02:48:45.8211745Z Generating XML reports... 2022-11-23T02:48:45.8212182Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024656.xml 2022-11-23T02:48:45.8212556Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8212733Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8213119Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8213314Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8213555Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplev6rrcy 2022-11-23T02:48:45.8213832Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplev6rrcy/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8213852Z 2022-11-23T02:48:45.8213963Z Running tests... 2022-11-23T02:48:45.8214299Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8214630Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8214875Z test_reduce_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8215097Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82542 2022-11-23T02:48:45.8215319Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82543 2022-11-23T02:48:45.8215537Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82544 2022-11-23T02:48:45.8215736Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82545 2022-11-23T02:48:45.8216167Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8216343Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8216732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8216923Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8217290Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8217466Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8217843Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8218014Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8218383Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8218558Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8218944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8219132Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8219496Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8219670Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8220050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8220239Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8220483Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4qi0lfdb 2022-11-23T02:48:45.8220761Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4qi0lfdb/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8221016Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcuglmrcy 2022-11-23T02:48:45.8221290Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcuglmrcy/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8221524Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8221751Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8222007Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp73nx06q4 2022-11-23T02:48:45.8222271Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp73nx06q4/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8222481Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8222744Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpygj2vnyp 2022-11-23T02:48:45.8223019Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpygj2vnyp/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8223297Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8223408Z ok (4.523s) 
2022-11-23T02:48:45.8223429Z 2022-11-23T02:48:45.8223708Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8223822Z Ran 1 test in 4.524s 2022-11-23T02:48:45.8223842Z 2022-11-23T02:48:45.8223934Z OK 2022-11-23T02:48:45.8223953Z 2022-11-23T02:48:45.8224077Z Generating XML reports... 2022-11-23T02:48:45.8224496Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024702.xml 2022-11-23T02:48:45.8224868Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8225113Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8225503Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8225701Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8225963Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5on0pn2q 2022-11-23T02:48:45.8226235Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5on0pn2q/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8226255Z 2022-11-23T02:48:45.8226365Z Running tests... 2022-11-23T02:48:45.8226617Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8226937Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8227196Z test_reduce_stress_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8227424Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82749 2022-11-23T02:48:45.8227645Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82750 2022-11-23T02:48:45.8227862Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82751 2022-11-23T02:48:45.8228082Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82752 2022-11-23T02:48:45.8228460Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8228638Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8229246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8229458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8229834Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8230014Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8230399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8230592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8230959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8231136Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8231493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8231688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8232053Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8232229Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8232604Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8232871Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8233145Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5x4lx_do 2022-11-23T02:48:45.8233416Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5x4lx_do/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8233673Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5syzttvi 2022-11-23T02:48:45.8233927Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5syzttvi/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8234160Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8234496Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpb5hkoxim 2022-11-23T02:48:45.8234763Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpb5hkoxim/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8235019Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk7eu5awt 2022-11-23T02:48:45.8235288Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk7eu5awt/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8235522Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8235750Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8235953Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8236056Z ok (6.784s) 
2022-11-23T02:48:45.8236076Z 2022-11-23T02:48:45.8236351Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8236468Z Ran 1 test in 6.784s 2022-11-23T02:48:45.8236487Z 2022-11-23T02:48:45.8236579Z OK 2022-11-23T02:48:45.8236597Z 2022-11-23T02:48:45.8236720Z Generating XML reports... 2022-11-23T02:48:45.8237166Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024709.xml 2022-11-23T02:48:45.8237545Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8237722Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8238092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8238286Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8238544Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpnmxmjs9z 2022-11-23T02:48:45.8238819Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpnmxmjs9z/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8238842Z 2022-11-23T02:48:45.8238954Z Running tests... 2022-11-23T02:48:45.8239222Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8239542Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8239780Z test_round_robin (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8239982Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 82960 2022-11-23T02:48:45.8240202Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 82961 2022-11-23T02:48:45.8240422Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 82962 2022-11-23T02:48:45.8240634Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 82963 2022-11-23T02:48:45.8241011Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8241189Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8241575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8241818Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8242199Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8242356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8242740Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8242930Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8243304Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8243530Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8243914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8244104Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8244476Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8244631Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8245010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8245199Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8245459Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp3d8_uvs8 2022-11-23T02:48:45.8245730Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp3d8_uvs8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8246021Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8246282Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp4u8h33s7 2022-11-23T02:48:45.8246554Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp4u8h33s7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8246784Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8247023Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmplsl2eh7l 2022-11-23T02:48:45.8247290Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmplsl2eh7l/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8247546Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa_mvpkmq 2022-11-23T02:48:45.8247812Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa_mvpkmq/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8248044Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8248273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8248850Z [W 
ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8249409Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8249952Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8250554Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8250666Z ok (4.122s) 2022-11-23T02:48:45.8250686Z 2022-11-23T02:48:45.8250962Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8251056Z Ran 1 test in 4.122s 2022-11-23T02:48:45.8251076Z 2022-11-23T02:48:45.8251167Z OK 2022-11-23T02:48:45.8251186Z 2022-11-23T02:48:45.8251310Z Generating XML reports... 
2022-11-23T02:48:45.8251748Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024718.xml 2022-11-23T02:48:45.8252179Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8252356Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8252744Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8252939Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8253181Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpyljmxchx 2022-11-23T02:48:45.8253455Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpyljmxchx/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8253475Z 2022-11-23T02:48:45.8253585Z Running tests... 2022-11-23T02:48:45.8253852Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8254168Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8254437Z test_round_robin_create_destroy (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8254662Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83155 2022-11-23T02:48:45.8254887Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83156 2022-11-23T02:48:45.8255110Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83157 2022-11-23T02:48:45.8255309Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83158 2022-11-23T02:48:45.8255688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8255865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8256254Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8256453Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8256828Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8257007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8257392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8257564Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8257934Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8258104Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8258524Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8258715Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8259089Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8259264Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8259692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8259888Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8260131Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpwwto9too 2022-11-23T02:48:45.8260407Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpwwto9too/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8260663Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpa9d3u0sg 2022-11-23T02:48:45.8260935Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpa9d3u0sg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8261267Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8261500Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8261765Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppgl2ahsw 2022-11-23T02:48:45.8262037Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppgl2ahsw/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8262250Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8262508Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpw8ba5d9n 2022-11-23T02:48:45.8262779Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpw8ba5d9n/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8263009Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8263576Z [W 
ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8264124Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8264676Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8265226Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8265780Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8266322Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. 
(function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8266861Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8267449Z [W ProcessGroupRoundRobin.cpp:12] Warning: ProcessGroupRoundRobin is deprecated and scheduled to be removed after this current release (1.13). Please file an issue on https://github.com/pytorch/pytorch/issues if there are any concerns or issues with this deprecation. (function ProcessGroupRoundRobin) 2022-11-23T02:48:45.8267561Z ok (4.377s) 2022-11-23T02:48:45.8267581Z 2022-11-23T02:48:45.8267857Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8267971Z Ran 1 test in 4.377s 2022-11-23T02:48:45.8267990Z 2022-11-23T02:48:45.8268083Z OK 2022-11-23T02:48:45.8268102Z 2022-11-23T02:48:45.8268210Z Generating XML reports... 
2022-11-23T02:48:45.8268649Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024725.xml 2022-11-23T02:48:45.8269327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8269514Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8269911Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8270107Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8270370Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx6wu68bs 2022-11-23T02:48:45.8270641Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx6wu68bs/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8270662Z 2022-11-23T02:48:45.8270752Z Running tests... 2022-11-23T02:48:45.8271019Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8271336Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8271594Z test_scatter_basics (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8271815Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83374 2022-11-23T02:48:45.8272038Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83375 2022-11-23T02:48:45.8272256Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83376 2022-11-23T02:48:45.8272473Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83377 2022-11-23T02:48:45.8272850Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8273011Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8273386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8273566Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8273953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8274150Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8274533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8274727Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8275098Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8275259Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8275640Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8275828Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8276200Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8276374Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8276829Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8277036Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8277297Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpoajaayw3 2022-11-23T02:48:45.8277571Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpoajaayw3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8277785Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8278042Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp1eh_66z2 2022-11-23T02:48:45.8278370Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp1eh_66z2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8278626Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpxkgr_dr8 2022-11-23T02:48:45.8278898Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpxkgr_dr8/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8279155Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp563zwudz 2022-11-23T02:48:45.8279425Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp563zwudz/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8279658Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8279866Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8280095Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8280200Z ok (4.111s) 
2022-11-23T02:48:45.8280220Z 2022-11-23T02:48:45.8280495Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8280612Z Ran 1 test in 4.111s 2022-11-23T02:48:45.8280631Z 2022-11-23T02:48:45.8280723Z OK 2022-11-23T02:48:45.8280742Z 2022-11-23T02:48:45.8280870Z Generating XML reports... 2022-11-23T02:48:45.8281310Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024732.xml 2022-11-23T02:48:45.8281689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8281850Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8282239Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8282432Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8282697Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp58fpea8d 2022-11-23T02:48:45.8282970Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp58fpea8d/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8282989Z 2022-11-23T02:48:45.8283098Z Running tests... 2022-11-23T02:48:45.8283370Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8283689Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8283930Z test_scatter_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8284152Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83557 2022-11-23T02:48:45.8284373Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83558 2022-11-23T02:48:45.8284589Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83559 2022-11-23T02:48:45.8284801Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83560 2022-11-23T02:48:45.8285182Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8285360Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8285799Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8285980Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8286352Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8286529Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8286909Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8287101Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8287522Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8287698Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8288079Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8288271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8288627Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8288799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8289178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8289369Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8289632Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpo7c1x49k 2022-11-23T02:48:45.8289911Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpo7c1x49k/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8290144Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8290404Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpl56tdof3 2022-11-23T02:48:45.8290658Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmph3ab_u4t 2022-11-23T02:48:45.8290910Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpl56tdof3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8291172Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmph3ab_u4t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8291426Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp8jck1w9t 2022-11-23T02:48:45.8291689Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp8jck1w9t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8291924Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8292155Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8292392Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8292495Z ok (5.858s) 
2022-11-23T02:48:45.8292516Z 2022-11-23T02:48:45.8292768Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8292881Z Ran 1 test in 5.858s 2022-11-23T02:48:45.8292900Z 2022-11-23T02:48:45.8292991Z OK 2022-11-23T02:48:45.8293010Z 2022-11-23T02:48:45.8293133Z Generating XML reports... 2022-11-23T02:48:45.8293575Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024738.xml 2022-11-23T02:48:45.8293951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8294133Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8294518Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8294773Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8295020Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp34sk35j3 2022-11-23T02:48:45.8295291Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp34sk35j3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8295311Z 2022-11-23T02:48:45.8295419Z Running tests... 2022-11-23T02:48:45.8295688Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8296005Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8296256Z test_scatter_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8296548Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83744 2022-11-23T02:48:45.8296768Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83745 2022-11-23T02:48:45.8296973Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83746 2022-11-23T02:48:45.8297188Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83747 2022-11-23T02:48:45.8297571Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8297748Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8298131Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8298327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8298702Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8298880Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8299241Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8299434Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8299800Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8299976Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8300351Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8300541Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8300915Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8301093Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8301468Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8301643Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8301903Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmphioa7nxe 2022-11-23T02:48:45.8302177Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmphioa7nxe/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8302436Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpk7ouayvg 2022-11-23T02:48:45.8302710Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpk7ouayvg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8302944Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8303181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8303438Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp0nqpylnp 2022-11-23T02:48:45.8303757Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp0nqpylnp/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8303976Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8304233Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpev_1o7wd 2022-11-23T02:48:45.8304500Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpev_1o7wd/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8304732Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8304834Z ok (4.126s) 
2022-11-23T02:48:45.8304854Z 2022-11-23T02:48:45.8305126Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8305288Z Ran 1 test in 4.126s 2022-11-23T02:48:45.8305308Z 2022-11-23T02:48:45.8305401Z OK 2022-11-23T02:48:45.8305420Z 2022-11-23T02:48:45.8305526Z Generating XML reports... 2022-11-23T02:48:45.8305972Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024746.xml 2022-11-23T02:48:45.8306348Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8306526Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8306917Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8307110Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8307369Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpdsbmwdl2 2022-11-23T02:48:45.8307643Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpdsbmwdl2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8307666Z 2022-11-23T02:48:45.8307773Z Running tests... 2022-11-23T02:48:45.8308024Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8308345Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8308592Z test_scatter_stress (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8308814Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 83927 2022-11-23T02:48:45.8309269Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 83928 2022-11-23T02:48:45.8309501Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 83929 2022-11-23T02:48:45.8309716Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 83930 2022-11-23T02:48:45.8310101Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8310265Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8310651Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8310847Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8311220Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8311393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8311772Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8311961Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8312331Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8312491Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8312862Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8313116Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8313519Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8313712Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8314092Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8314282Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8314543Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgq1wx20f 2022-11-23T02:48:45.8314818Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgq1wx20f/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8315094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8315353Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps4za4oij 2022-11-23T02:48:45.8315610Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp7cor0qls 2022-11-23T02:48:45.8315881Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps4za4oij/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8316153Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp7cor0qls/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8316409Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpocetl2oc 2022-11-23T02:48:45.8316678Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpocetl2oc/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8316910Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8317126Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8317357Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8317459Z ok (4.873s) 
2022-11-23T02:48:45.8317483Z 2022-11-23T02:48:45.8317756Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8317871Z Ran 1 test in 4.873s 2022-11-23T02:48:45.8317890Z 2022-11-23T02:48:45.8317982Z OK 2022-11-23T02:48:45.8318001Z 2022-11-23T02:48:45.8318125Z Generating XML reports... 2022-11-23T02:48:45.8318566Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024753.xml 2022-11-23T02:48:45.8318937Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8319097Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8319488Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8319680Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8319940Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpi6413y4d 2022-11-23T02:48:45.8320208Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpi6413y4d/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8320228Z 2022-11-23T02:48:45.8320338Z Running tests... 2022-11-23T02:48:45.8320607Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8320922Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8321222Z test_scatter_stress_cuda (__main__.ProcessGroupGlooTest) ... 
skip: Test is flaky, see https://github.com/pytorch/pytorch/issues/15963 (0.001s) 2022-11-23T02:48:45.8321260Z 2022-11-23T02:48:45.8321506Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8321620Z Ran 1 test in 0.001s 2022-11-23T02:48:45.8321639Z 2022-11-23T02:48:45.8321747Z OK (skipped=1) 2022-11-23T02:48:45.8321766Z 2022-11-23T02:48:45.8321888Z Generating XML reports... 2022-11-23T02:48:45.8322374Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024800.xml 2022-11-23T02:48:45.8322759Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8322936Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8323323Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8323499Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8323758Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpuwavmk7g 2022-11-23T02:48:45.8324080Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpuwavmk7g/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8324100Z 2022-11-23T02:48:45.8324207Z Running tests... 2022-11-23T02:48:45.8324479Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8324796Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8325054Z test_send_recv_all_to_all (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8325276Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84167 2022-11-23T02:48:45.8325497Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84168 2022-11-23T02:48:45.8325697Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84169 2022-11-23T02:48:45.8325912Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84170 2022-11-23T02:48:45.8326295Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8326471Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8326859Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8327054Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8327424Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8327598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8327957Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8328151Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8328520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8328700Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8329080Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8329271Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8329641Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8329813Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8330188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8330359Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8330620Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpzf0o0a3y 2022-11-23T02:48:45.8330896Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpzf0o0a3y/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8331153Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpcn_w3khx 2022-11-23T02:48:45.8331469Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpcn_w3khx/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8331710Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8331966Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpgr8rcuqs 2022-11-23T02:48:45.8332235Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpgr8rcuqs/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8332470Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5vzoy3hn 2022-11-23T02:48:45.8332739Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5vzoy3hn/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8333018Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8333249Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8333484Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8333586Z ok (4.167s) 
2022-11-23T02:48:45.8333606Z 2022-11-23T02:48:45.8333879Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8333992Z Ran 1 test in 4.168s 2022-11-23T02:48:45.8334011Z 2022-11-23T02:48:45.8334103Z OK 2022-11-23T02:48:45.8334122Z 2022-11-23T02:48:45.8334228Z Generating XML reports... 2022-11-23T02:48:45.8334668Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024802.xml 2022-11-23T02:48:45.8335042Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8335225Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8335612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8335807Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8336068Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpfjakdmlg 2022-11-23T02:48:45.8336344Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpfjakdmlg/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8336363Z 2022-11-23T02:48:45.8336472Z Running tests... 2022-11-23T02:48:45.8336718Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8337032Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8337310Z test_sparse_allreduce_basics (__main__.ProcessGroupGlooTest) ... skip: intermittent failures on Windows, in CI (0.000s) 2022-11-23T02:48:45.8337333Z 2022-11-23T02:48:45.8337597Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8337710Z Ran 1 test in 0.001s 2022-11-23T02:48:45.8337729Z 2022-11-23T02:48:45.8337837Z OK (skipped=1) 2022-11-23T02:48:45.8337856Z 2022-11-23T02:48:45.8337984Z Generating XML reports... 
2022-11-23T02:48:45.8338418Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024809.xml 2022-11-23T02:48:45.8338773Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8338951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8339336Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8339529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8339790Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpohvcuczj 2022-11-23T02:48:45.8340066Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpohvcuczj/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8340086Z 2022-11-23T02:48:45.8340195Z Running tests... 2022-11-23T02:48:45.8340508Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8340834Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8341087Z test_sparse_allreduce_basics_cuda (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8341310Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84383 2022-11-23T02:48:45.8341532Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84384 2022-11-23T02:48:45.8341751Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84385 2022-11-23T02:48:45.8341965Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84386 2022-11-23T02:48:45.8342395Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8342567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8342954Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8343130Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8343500Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8343672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8344050Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8344244Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8344616Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8344792Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8345173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8345360Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8345717Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8345890Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8346273Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8346463Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8346729Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpv5g3kqgs 2022-11-23T02:48:45.8347004Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpv5g3kqgs/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8347264Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpigi5ud3f 2022-11-23T02:48:45.8347535Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpigi5ud3f/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8347773Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmppdq4sjtr 2022-11-23T02:48:45.8348044Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmppdq4sjtr/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8348275Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8348502Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8348760Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp6nt5otwu 2022-11-23T02:48:45.8349267Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp6nt5otwu/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8349509Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8349810Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8349922Z ok (6.048s) 
2022-11-23T02:48:45.8349944Z 2022-11-23T02:48:45.8350200Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8350315Z Ran 1 test in 6.049s 2022-11-23T02:48:45.8350335Z 2022-11-23T02:48:45.8350427Z OK 2022-11-23T02:48:45.8350447Z 2022-11-23T02:48:45.8350571Z Generating XML reports... 2022-11-23T02:48:45.8351012Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024811.xml 2022-11-23T02:48:45.8351387Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8351630Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8352021Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8352200Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8352458Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp9u84r53k 2022-11-23T02:48:45.8352729Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp9u84r53k/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8352749Z 2022-11-23T02:48:45.8352858Z Running tests... 2022-11-23T02:48:45.8353127Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8353440Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8353706Z test_sparse_allreduce_checks (__main__.ProcessGroupGlooTest) ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8353931Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 84750 2022-11-23T02:48:45.8354155Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 84751 2022-11-23T02:48:45.8354359Z INFO:torch.testing._internal.common_distributed:Started process 2 with pid 84752 2022-11-23T02:48:45.8354575Z INFO:torch.testing._internal.common_distributed:Started process 3 with pid 84753 2022-11-23T02:48:45.8354953Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8355130Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8355516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8355710Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8356084Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8356258Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8356621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8356811Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8357186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8357361Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8357737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8357926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8358293Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8358519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8358900Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8359121Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8359390Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpobctd51w 2022-11-23T02:48:45.8359664Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpobctd51w/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8359923Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpirzu4e63 2022-11-23T02:48:45.8360194Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpirzu4e63/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8360450Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpy4_o8c2t 2022-11-23T02:48:45.8360779Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpy4_o8c2t/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8361014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:48:45.8361231Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 3 2022-11-23T02:48:45.8361458Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 2 2022-11-23T02:48:45.8361710Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp5rahltt6 2022-11-23T02:48:45.8361979Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp5rahltt6/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8362208Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:48:45.8362310Z ok (4.175s) 
2022-11-23T02:48:45.8362330Z 2022-11-23T02:48:45.8362605Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8362721Z Ran 1 test in 4.175s 2022-11-23T02:48:45.8362740Z 2022-11-23T02:48:45.8362813Z OK 2022-11-23T02:48:45.8362851Z 2022-11-23T02:48:45.8362957Z Generating XML reports... 2022-11-23T02:48:45.8363397Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024819.xml 2022-11-23T02:48:45.8363774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8363950Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8364337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8364531Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8364788Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpd473vrwe 2022-11-23T02:48:45.8365061Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpd473vrwe/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8365080Z 2022-11-23T02:48:45.8365170Z Running tests... 2022-11-23T02:48:45.8365436Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8365755Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8365929Z test_forward_backward (__main__.ReducerTest) ... ok (0.008s) 2022-11-23T02:48:45.8365948Z 2022-11-23T02:48:45.8366205Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8366315Z Ran 1 test in 0.012s 2022-11-23T02:48:45.8366335Z 2022-11-23T02:48:45.8366427Z OK 2022-11-23T02:48:45.8366446Z 2022-11-23T02:48:45.8366573Z Generating XML reports... 
2022-11-23T02:48:45.8366973Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024826.xml 2022-11-23T02:48:45.8367327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8367509Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8367893Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8368135Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8368402Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpqoyph5n7 2022-11-23T02:48:45.8368677Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpqoyph5n7/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8368698Z 2022-11-23T02:48:45.8368807Z Running tests... 2022-11-23T02:48:45.8369076Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8369371Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8370246Z test_forward_backward_optimizer (__main__.ReducerTest) ... [W reducer.cpp:1305] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. 
(function operator()) 2022-11-23T02:48:45.8370401Z ok (0.012s) 2022-11-23T02:48:45.8370421Z 2022-11-23T02:48:45.8370691Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8370803Z Ran 1 test in 0.022s 2022-11-23T02:48:45.8370823Z 2022-11-23T02:48:45.8370914Z OK 2022-11-23T02:48:45.8370933Z 2022-11-23T02:48:45.8371057Z Generating XML reports... 2022-11-23T02:48:45.8371460Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024828.xml 2022-11-23T02:48:45.8371841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8372019Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8372409Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8372585Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8372845Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpp3o_pnq9 2022-11-23T02:48:45.8373117Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpp3o_pnq9/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8373136Z 2022-11-23T02:48:45.8373245Z Running tests... 2022-11-23T02:48:45.8373511Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8373825Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8374037Z test_forward_backward_unused_parameters (__main__.ReducerTest) ... ok (0.009s) 2022-11-23T02:48:45.8374057Z 2022-11-23T02:48:45.8374317Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8374413Z Ran 1 test in 0.012s 2022-11-23T02:48:45.8374433Z 2022-11-23T02:48:45.8374525Z OK 2022-11-23T02:48:45.8374544Z 2022-11-23T02:48:45.8374666Z Generating XML reports... 
2022-11-23T02:48:45.8375067Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024830.xml 2022-11-23T02:48:45.8375441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8375616Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8375999Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8376196Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8376455Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpscd8f_ts 2022-11-23T02:48:45.8376758Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpscd8f_ts/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8376780Z 2022-11-23T02:48:45.8376895Z Running tests... 2022-11-23T02:48:45.8377164Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8377480Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8377667Z test_multi_dtype_multi_bucket (__main__.ReducerTest) ... ok (0.004s) 2022-11-23T02:48:45.8377687Z 2022-11-23T02:48:45.8377950Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8378061Z Ran 1 test in 0.012s 2022-11-23T02:48:45.8378080Z 2022-11-23T02:48:45.8378173Z OK 2022-11-23T02:48:45.8378193Z 2022-11-23T02:48:45.8378346Z Generating XML reports... 
2022-11-23T02:48:45.8378748Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024832.xml 2022-11-23T02:48:45.8379127Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8379308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8379692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8379883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8380141Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp30o8ca19 2022-11-23T02:48:45.8380414Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp30o8ca19/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8380434Z 2022-11-23T02:48:45.8380542Z Running tests... 2022-11-23T02:48:45.8380793Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8381107Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8381297Z test_multi_dtype_single_bucket (__main__.ReducerTest) ... ok (0.006s) 2022-11-23T02:48:45.8381319Z 2022-11-23T02:48:45.8381578Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8381691Z Ran 1 test in 0.011s 2022-11-23T02:48:45.8381711Z 2022-11-23T02:48:45.8381801Z OK 2022-11-23T02:48:45.8381820Z 2022-11-23T02:48:45.8381944Z Generating XML reports... 
2022-11-23T02:48:45.8382339Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024835.xml 2022-11-23T02:48:45.8382694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8382874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8383260Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8383457Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8383720Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmp39epiav3 2022-11-23T02:48:45.8383993Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmp39epiav3/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8384012Z 2022-11-23T02:48:45.8384120Z Running tests... 2022-11-23T02:48:45.8384383Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8384696Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8384870Z test_single_dtype_single_bucket (__main__.ReducerTest) ... ok (0.003s) 2022-11-23T02:48:45.8384889Z 2022-11-23T02:48:45.8385151Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8385264Z Ran 1 test in 0.012s 2022-11-23T02:48:45.8385285Z 2022-11-23T02:48:45.8385375Z OK 2022-11-23T02:48:45.8385394Z 2022-11-23T02:48:45.8385516Z Generating XML reports... 
2022-11-23T02:48:45.8385913Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024837.xml 2022-11-23T02:48:45.8386335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8386521Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8386892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8387088Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8387348Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmpx11k5ugz 2022-11-23T02:48:45.8387619Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmpx11k5ugz/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8387683Z 2022-11-23T02:48:45.8387798Z Running tests... 2022-11-23T02:48:45.8388065Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8388385Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8388619Z test_logging_init (__main__.RendezvousEnvTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8388868Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:48:45.8389576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes. 2022-11-23T02:48:45.8389685Z ok (1.631s) 2022-11-23T02:48:45.8389706Z 2022-11-23T02:48:45.8389973Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8390085Z Ran 1 test in 1.631s 2022-11-23T02:48:45.8390104Z 2022-11-23T02:48:45.8390198Z OK 2022-11-23T02:48:45.8390221Z 2022-11-23T02:48:45.8390347Z Generating XML reports... 
2022-11-23T02:48:45.8390766Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20221123024839.xml 2022-11-23T02:48:45.8391145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:48:45.8391306Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:48:45.8391689Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:48:45.8391883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:48:45.8392143Z INFO:torch.distributed.nn.jit.instantiator:Created a temporary directory at /tmp/tmps6l90qe2 2022-11-23T02:48:45.8392415Z INFO:torch.distributed.nn.jit.instantiator:Writing /tmp/tmps6l90qe2/_remote_module_non_scriptable.py 2022-11-23T02:48:45.8392434Z 2022-11-23T02:48:45.8392545Z Running tests... 2022-11-23T02:48:45.8392813Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8393126Z Test results will be stored in test-reports/python-unittest/distributed.test_c10d_gloo 2022-11-23T02:48:45.8393371Z test_default_store_timeout_gloo (__main__.TimeoutTest) ... INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:48:45.8394109Z skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/74714 for allplatform(s) . If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (1.643s) 2022-11-23T02:48:45.8394149Z 2022-11-23T02:48:45.8394396Z ---------------------------------------------------------------------- 2022-11-23T02:48:45.8394509Z Ran 1 test in 1.643s 2022-11-23T02:48:45.8394528Z 2022-11-23T02:48:45.8394636Z OK (skipped=1) 2022-11-23T02:48:45.8394655Z 2022-11-23T02:48:45.8394777Z Generating XML reports... 
2022-11-23T02:48:45.8395185Z Generated XML report: test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20221123024843.xml 2022-11-23T02:48:45.8395205Z 2022-11-23T02:48:45.8395653Z ##[endgroup] 2022-11-23T02:48:45.8396177Z FINISHED PRINTING LOG FILE of distributed/test_c10d_gloo (/var/lib/jenkins/workspace/test/test-reports/distributed-test_c10d_gloo_7u35nqr_) 2022-11-23T02:48:45.8396200Z 2022-11-23T02:48:45.8396478Z Running distributed/fsdp/test_fsdp_core ... [2022-11-23 02:48:45.681427] 2022-11-23T02:48:45.8396934Z Executing ['/opt/conda/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '-v', '--import-slow-tests', '--import-disabled-tests'] ... [2022-11-23 02:48:45.681731] 2022-11-23T02:58:19.6233899Z 2022-11-23T02:58:19.6237776Z Expand the folded group to see the log file of distributed/fsdp/test_fsdp_core 2022-11-23T02:58:19.6238713Z ##[group]PRINTING LOG FILE of distributed/fsdp/test_fsdp_core (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_core_dav8tr5u) 2022-11-23T02:58:19.6265847Z 2022-11-23T02:58:19.6266116Z Running tests... 2022-11-23T02:58:19.6266656Z ---------------------------------------------------------------------- 2022-11-23T02:58:19.6268813Z Test results will be stored in test-reports/python-unittest/distributed.fsdp.test_fsdp_core 2022-11-23T02:58:19.6269862Z test_pre_backward_hook_registration_after_state_dict (__main__.TestHooks) 2022-11-23T02:58:19.6270498Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:numba.cuda.cudadrv.driver:init 2022-11-23T02:58:19.6272579Z INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85273 2022-11-23T02:58:19.6273063Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85274 2022-11-23T02:58:19.6273732Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6274205Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6274807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6275280Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6275886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6276354Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6277189Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6278013Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6278881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6280496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6281805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6283032Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6284010Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6284889Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6287296Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6288813Z warnings.warn( 2022-11-23T02:58:19.6291231Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6292719Z warnings.warn( 2022-11-23T02:58:19.6293091Z dist init r=0, world=2 2022-11-23T02:58:19.6293603Z dist init r=1, world=2 2022-11-23T02:58:19.6294035Z ok (6.479s) 2022-11-23T02:58:19.6294446Z test_pre_backward_hook_registration_cuda_first_False (__main__.TestHooks) 2022-11-23T02:58:19.6295151Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85356 2022-11-23T02:58:19.6295841Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85357 2022-11-23T02:58:19.6296462Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6296930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6297539Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6298020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6298598Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6299086Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6299922Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6300667Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6301126Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6301638Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6302431Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6303375Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6303916Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6304436Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6305730Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6306546Z warnings.warn( 2022-11-23T02:58:19.6307693Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6308497Z warnings.warn( 2022-11-23T02:58:19.6308914Z dist init r=1, world=2 2022-11-23T02:58:19.6309558Z dist init r=0, world=2 2022-11-23T02:58:19.6309784Z ok (4.712s) 2022-11-23T02:58:19.6310128Z test_pre_backward_hook_registration_cuda_first_True (__main__.TestHooks) 2022-11-23T02:58:19.6310823Z Tests that FSDP pre-backward hooks are registered on forward pass ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85439 2022-11-23T02:58:19.6311485Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85440 2022-11-23T02:58:19.6312116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6312583Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6313188Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6313652Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6314512Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6315393Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6316426Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6317288Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6318120Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6319046Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6320157Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6321423Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6322393Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6323299Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6323817Z dist init r=0, world=2 2022-11-23T02:58:19.6324256Z dist init r=1, world=2 2022-11-23T02:58:19.6324679Z ok (4.712s) 2022-11-23T02:58:19.6325318Z test_register_functions_called_cuda_first_False_mixed_precision_False (__main__.TestHooks) 2022-11-23T02:58:19.6326193Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85522 2022-11-23T02:58:19.6327160Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85523 2022-11-23T02:58:19.6328222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6328876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6329457Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6329947Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6330540Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6331001Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6332106Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6332971Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6333799Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6334586Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to 
store for rank: 1 2022-11-23T02:58:19.6335594Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6336802Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6337706Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6338616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6341663Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6342481Z warnings.warn( 2022-11-23T02:58:19.6343656Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6344526Z warnings.warn( 2022-11-23T02:58:19.6344783Z dist init r=0, world=2 2022-11-23T02:58:19.6345019Z dist init r=1, world=2 2022-11-23T02:58:19.6345267Z ok (4.813s) 2022-11-23T02:58:19.6345625Z test_register_functions_called_cuda_first_False_mixed_precision_True (__main__.TestHooks) 2022-11-23T02:58:19.6346163Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85601 2022-11-23T02:58:19.6346710Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85602 2022-11-23T02:58:19.6347349Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6347820Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6348403Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6349424Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6350587Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6351416Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6352306Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6352797Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6353269Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6353911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6354852Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6355833Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6356714Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6357545Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6359617Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6360568Z warnings.warn( 2022-11-23T02:58:19.6361714Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6362442Z warnings.warn( 2022-11-23T02:58:19.6363624Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6365093Z warnings.warn( 2022-11-23T02:58:19.6367260Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6368702Z warnings.warn( 2022-11-23T02:58:19.6369159Z dist init r=0, world=2 2022-11-23T02:58:19.6369474Z dist init r=1, world=2 2022-11-23T02:58:19.6369699Z ok (4.713s) 2022-11-23T02:58:19.6370245Z test_register_functions_called_cuda_first_True_mixed_precision_False (__main__.TestHooks) 2022-11-23T02:58:19.6371274Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85680 2022-11-23T02:58:19.6372248Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85681 2022-11-23T02:58:19.6373434Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6374266Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6375343Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6376147Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6377200Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6378051Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6379166Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6379861Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6380507Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6381426Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6382651Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6383923Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6384925Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6385805Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6386160Z dist init r=1, world=2 2022-11-23T02:58:19.6386431Z dist init r=0, world=2 2022-11-23T02:58:19.6386676Z ok (4.813s) 2022-11-23T02:58:19.6387024Z test_register_functions_called_cuda_first_True_mixed_precision_True (__main__.TestHooks) 2022-11-23T02:58:19.6387656Z Tests that ``_register_{pre|post}_backward_hooks()`` are called ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85759 2022-11-23T02:58:19.6388212Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85760 2022-11-23T02:58:19.6388851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6389930Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6390543Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6391020Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6391614Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6392376Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6393019Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6393508Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6393961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6394475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6395148Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6395864Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6396379Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6396871Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6398024Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6398746Z warnings.warn( 2022-11-23T02:58:19.6400127Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6400819Z warnings.warn( 2022-11-23T02:58:19.6401070Z dist init r=1, world=2 2022-11-23T02:58:19.6401329Z dist init r=0, world=2 2022-11-23T02:58:19.6401549Z ok (4.813s) 2022-11-23T02:58:19.6401889Z test_transformer_no_grad_mixed_precision_False (__main__.TestNoGrad) 2022-11-23T02:58:19.6402566Z Tests that for an FSDP-wrapped transformer model with shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85838 2022-11-23T02:58:19.6403127Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85839 2022-11-23T02:58:19.6403918Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6404739Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6405721Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6406638Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6407599Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6408539Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6409641Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6410479Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6411342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6412219Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6413458Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6414798Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6415782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6416683Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6419116Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6420203Z warnings.warn( 2022-11-23T02:58:19.6421384Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6422184Z warnings.warn( 2022-11-23T02:58:19.6422439Z dist init r=1, world=2 2022-11-23T02:58:19.6422694Z dist init r=0, world=2 2022-11-23T02:58:19.6422922Z ok (4.812s) 2022-11-23T02:58:19.6423248Z test_transformer_no_grad_mixed_precision_True (__main__.TestNoGrad) 2022-11-23T02:58:19.6423923Z Tests that for an FSDP-wrapped transformer model with shared ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 85921 2022-11-23T02:58:19.6424478Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 85922 2022-11-23T02:58:19.6425078Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6425553Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6426144Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6426621Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6427222Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6427684Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6428535Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6429549Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6430144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6430663Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6431348Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6432139Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6432696Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6433187Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6434346Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6435130Z warnings.warn( 2022-11-23T02:58:19.6436326Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6437128Z warnings.warn( 2022-11-23T02:58:19.6439016Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6440351Z warnings.warn( 2022-11-23T02:58:19.6442586Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6444031Z warnings.warn( 2022-11-23T02:58:19.6444460Z dist init r=1, world=2 2022-11-23T02:58:19.6444890Z dist init r=0, world=2 2022-11-23T02:58:19.6445304Z ok (4.814s) 2022-11-23T02:58:19.6445823Z test_param_change_after_init_mixed_precision_False (__main__.TestParamInit) 2022-11-23T02:58:19.6447114Z Tests that changing FSDP model parameter values in-place after FSDP ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86004 2022-11-23T02:58:19.6448138Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86005 2022-11-23T02:58:19.6449285Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6450154Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6451270Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6452149Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6453216Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6454052Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6455168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6456050Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6456858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6457786Z INFO:torch.distributed.distributed_c10d:Added 
key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6459028Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6460349Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6461308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6462171Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6464578Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6466114Z warnings.warn( 2022-11-23T02:58:19.6468129Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.6469888Z warnings.warn( 2022-11-23T02:58:19.6470304Z dist init r=0, world=2 2022-11-23T02:58:19.6470764Z dist init r=1, world=2 2022-11-23T02:58:19.6471185Z ok (4.712s) 2022-11-23T02:58:19.6471759Z test_param_change_after_init_mixed_precision_True (__main__.TestParamInit) 2022-11-23T02:58:19.6472995Z Tests that changing FSDP model parameter values in-place after FSDP ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86083 2022-11-23T02:58:19.6474015Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86084 2022-11-23T02:58:19.6475145Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6475983Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6477083Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6477909Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6478906Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6479784Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6480831Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6481748Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6482556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6483476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6484479Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6485706Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6486616Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6487494Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6489745Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6491073Z warnings.warn( 2022-11-23T02:58:19.6492932Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:68: UserWarning: Both mixed precision and an `auto_wrap_policy` were specified for FSDP, where the wrapped module has batch norm submodules. The batch norm submodules will be wrapped as separate FSDP instances with mixed precision disabled since some batch norm kernels do not support low precision. 2022-11-23T02:58:19.6493815Z warnings.warn( 2022-11-23T02:58:19.6495016Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.6495802Z warnings.warn( 2022-11-23T02:58:19.6496983Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6497765Z warnings.warn( 2022-11-23T02:58:19.6498002Z dist init r=0, world=2 2022-11-23T02:58:19.6498265Z dist init r=1, world=2 2022-11-23T02:58:19.6498510Z ok (4.812s) 2022-11-23T02:58:19.6498841Z test_delayed_optim_step_offload_false_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:58:19.6499392Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86162 2022-11-23T02:58:19.6499944Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86163 2022-11-23T02:58:19.6500560Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6501031Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6501621Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6502111Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6502682Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6503150Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6503745Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:58:19.6504222Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6504667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6505181Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6505864Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6506580Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6507094Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6507640Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6508146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6508632Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6510590Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6511506Z warnings.warn( 2022-11-23T02:58:19.6512687Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6513462Z warnings.warn( 2022-11-23T02:58:19.6513838Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6514327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6514822Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6515459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6516014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6516492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6516988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6517472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6517936Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6518422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6518909Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6519390Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6519851Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6520341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6520823Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6521288Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6521769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6522248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6523555Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.6524651Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.6525991Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.6526911Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.6527444Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6528124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6528621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6529092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6529581Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6530070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6530530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6531011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6531501Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6531986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6532458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6532951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6533441Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6533926Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6534393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6534873Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6535358Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6535820Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6536299Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6536780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6537259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6537715Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6538208Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6538690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6539718Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.6540986Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.6541817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6542324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6542837Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6543298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6543670Z dist init r=0, world=2 2022-11-23T02:58:19.6543927Z dist init r=1, world=2 2022-11-23T02:58:19.6544167Z ok (17.131s) 2022-11-23T02:58:19.6544494Z test_delayed_optim_step_offload_false_none (__main__.TestParityWithDDP) 2022-11-23T02:58:19.6545043Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86245 2022-11-23T02:58:19.6545649Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86246 2022-11-23T02:58:19.6546267Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6546742Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6547334Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6547820Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6548399Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6548860Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6550066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6550563Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6551016Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6551525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6552196Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6552882Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6553424Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6553909Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6554397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6554881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6556192Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6556984Z warnings.warn( 2022-11-23T02:58:19.6558168Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6558952Z warnings.warn( 2022-11-23T02:58:19.6559403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6559921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6560412Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6560895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6561373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6561861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6562346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6562916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6563383Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6563874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6564356Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6564811Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6565286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6565776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6566257Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6566724Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6567218Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6567697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6568728Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.6570297Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.6571208Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.6571664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6572156Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6572852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6573331Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6573821Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6574309Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6574787Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6575249Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6575731Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6576278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6576750Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6577232Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6577714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6578196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6578652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6579186Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6579734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6580226Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6580695Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6581181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6581658Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6582112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6582586Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6583067Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6584291Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.6585064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6585533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6586018Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6586494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6586843Z dist init r=1, world=2 2022-11-23T02:58:19.6587105Z dist init r=0, world=2 2022-11-23T02:58:19.6587349Z ok (29.349s) 2022-11-23T02:58:19.6587686Z test_delayed_optim_step_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:58:19.6588252Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86328 2022-11-23T02:58:19.6588806Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86329 2022-11-23T02:58:19.6589865Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6590317Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6590910Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6591395Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6591986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6592421Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6593018Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6593496Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:58:19.6594032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6594561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6595245Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6595949Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6596463Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6596951Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6597514Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6598017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6599304Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6600121Z warnings.warn( 2022-11-23T02:58:19.6601294Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6602090Z warnings.warn( 2022-11-23T02:58:19.6602475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6602948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6603443Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6603925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6604422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6604888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6605418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6605904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6606392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6606855Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6607340Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6607816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6608271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6608755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6609240Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6609725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6610186Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6610716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6611748Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.6613309Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.6614262Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.6614716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6615214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6615705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6616199Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6616662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6617144Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6617631Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6618102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6618589Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6619078Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6619558Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6620013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6620498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6620977Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6621442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6621920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6622411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6622876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6623359Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6623840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6624319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6624772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6625251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6625727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6626810Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.6627551Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6628040Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6628523Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6629888Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6630323Z dist init r=0, world=2 2022-11-23T02:58:19.6630582Z dist init r=1, world=2 2022-11-23T02:58:19.6630827Z ok (22.941s) 2022-11-23T02:58:19.6631159Z test_delayed_optim_step_offload_true_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:58:19.6632494Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82490 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:58:19.6633294Z test_delayed_optim_step_offload_true_none (__main__.TestParityWithDDP) 2022-11-23T02:58:19.6633844Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86411 2022-11-23T02:58:19.6634373Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86412 2022-11-23T02:58:19.6635002Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6635467Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6636067Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6636537Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6637138Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.6637594Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.6638186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.6638651Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.6639116Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.6639626Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.6640281Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.6641253Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.6641818Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.6642308Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.6642778Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6643277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6644584Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.6645382Z warnings.warn( 2022-11-23T02:58:19.6646630Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.6647420Z warnings.warn( 2022-11-23T02:58:19.6647675Z File "", line 1, in 2022-11-23T02:58:19.6648056Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6648497Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6648856Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6649234Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6649637Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6649961Z self.run() 2022-11-23T02:58:19.6650356Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6650730Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6651264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6651651Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6652191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6652590Z getattr(self, test_name)() 2022-11-23T02:58:19.6653104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6653484Z fn() 2022-11-23T02:58:19.6653980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6654394Z test(self, **param_kwargs) 2022-11-23T02:58:19.6654902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6655304Z return func(*args, **kwargs) 2022-11-23T02:58:19.6655719Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6656083Z self.run_subtests( 2022-11-23T02:58:19.6656594Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6657028Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6657590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6657999Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6658568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6658973Z output = model(*input) 2022-11-23T02:58:19.6659446Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6659845Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6660404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6660871Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6661433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6661835Z _lazy_init(state, module) 2022-11-23T02:58:19.6662348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6662744Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6663387Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6663786Z return func(*args, **kwargs) 2022-11-23T02:58:19.6664339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6664716Z p_assert( 2022-11-23T02:58:19.6665192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.6665578Z traceback.print_stack() 2022-11-23T02:58:19.6665856Z File "", line 1, in 2022-11-23T02:58:19.6666234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6666673Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6667035Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6667415Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6667814Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6668157Z self.run() 2022-11-23T02:58:19.6668480Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6668852Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6669927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6670308Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6670853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6671255Z getattr(self, test_name)() 2022-11-23T02:58:19.6671793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6672152Z fn() 2022-11-23T02:58:19.6672654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6673062Z test(self, **param_kwargs) 2022-11-23T02:58:19.6673566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6673969Z return func(*args, **kwargs) 2022-11-23T02:58:19.6674381Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6674745Z self.run_subtests( 2022-11-23T02:58:19.6675251Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6675684Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6676249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6676665Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6677234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6677647Z output = model(*input) 2022-11-23T02:58:19.6678113Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6678511Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6679062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6679529Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6680086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6680498Z _lazy_init(state, module) 2022-11-23T02:58:19.6681011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6681426Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6682030Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6682434Z return func(*args, **kwargs) 2022-11-23T02:58:19.6682985Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6683360Z p_assert( 2022-11-23T02:58:19.6683840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.6684231Z traceback.print_stack() 2022-11-23T02:58:19.6684615Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6685200Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6685585Z File "", line 1, in 2022-11-23T02:58:19.6685964Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6686328Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6686710Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6687090Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6687467Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6687812Z self.run() 2022-11-23T02:58:19.6688155Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6688511Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6689047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6689454Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6689996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6690379Z getattr(self, test_name)() 2022-11-23T02:58:19.6690907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6691280Z fn() 2022-11-23T02:58:19.6691767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6692175Z test(self, **param_kwargs) 2022-11-23T02:58:19.6692697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6693095Z return func(*args, 
**kwargs) 2022-11-23T02:58:19.6693488Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6693876Z self.run_subtests( 2022-11-23T02:58:19.6694156Z File "", line 1, in 2022-11-23T02:58:19.6694699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6695145Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6695711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6696146Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6696527Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6696900Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6697463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6697887Z output = model(*input) 2022-11-23T02:58:19.6698413Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6698804Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6699335Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6699715Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6700168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6700526Z self.run() 2022-11-23T02:58:19.6701035Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6701508Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6701918Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6702295Z 
self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6702828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6703288Z _lazy_init(state, module) 2022-11-23T02:58:19.6703793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6704172Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6704700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6705114Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6705642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6706039Z getattr(self, test_name)() 2022-11-23T02:58:19.6706545Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6706935Z return func(*args, **kwargs) 2022-11-23T02:58:19.6707443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6707828Z fn() 2022-11-23T02:58:19.6708344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6708720Z p_assert( 2022-11-23T02:58:19.6709574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6709983Z test(self, **param_kwargs) 2022-11-23T02:58:19.6710490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6710857Z traceback.print_stack() 2022-11-23T02:58:19.6711391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6711796Z return func(*args, 
**kwargs) 2022-11-23T02:58:19.6712184Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6712575Z self.run_subtests( 2022-11-23T02:58:19.6713084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6713520Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6714066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6714494Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6715064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6715453Z output = model(*input) 2022-11-23T02:58:19.6715938Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6716332Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6716887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6717339Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6717921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6718405Z _lazy_init(state, module) 2022-11-23T02:58:19.6718916Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6719339Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6719864Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6720254Z return func(*args, **kwargs) 2022-11-23T02:58:19.6720778Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6721168Z p_assert( 2022-11-23T02:58:19.6721646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6722085Z traceback.print_stack() 2022-11-23T02:58:19.6722491Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6722992Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6723376Z File "", line 1, in 2022-11-23T02:58:19.6723737Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6724116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6724495Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6724852Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6725244Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6725585Z self.run() 2022-11-23T02:58:19.6725904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6726280Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6726807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6727205Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6727731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6728134Z getattr(self, test_name)() 2022-11-23T02:58:19.6728662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6729016Z fn() 2022-11-23T02:58:19.6729516Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6729921Z test(self, **param_kwargs) 2022-11-23T02:58:19.6730446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6730830Z return func(*args, **kwargs) 2022-11-23T02:58:19.6731241Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6731625Z self.run_subtests( 2022-11-23T02:58:19.6732120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6732551Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6733111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6733538Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6734078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6734479Z output = model(*input) 2022-11-23T02:58:19.6734962Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6735339Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6735637Z File "", line 1, in 2022-11-23T02:58:19.6736236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6736711Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6737269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6737668Z _lazy_init(state, module) 2022-11-23T02:58:19.6738035Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 
116, in spawn_main 2022-11-23T02:58:19.6738393Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6738921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6739390Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6739752Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6740133Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6740663Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6741052Z return func(*args, **kwargs) 2022-11-23T02:58:19.6741406Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6741745Z self.run() 2022-11-23T02:58:19.6742266Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6742636Z p_assert( 2022-11-23T02:58:19.6742974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6743352Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6743848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6744238Z traceback.print_stack() 2022-11-23T02:58:19.6744738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6745132Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6745645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6746043Z getattr(self, test_name)() 2022-11-23T02:58:19.6746566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6746918Z fn() 2022-11-23T02:58:19.6747413Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6747807Z test(self, **param_kwargs) 2022-11-23T02:58:19.6748326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6748707Z return func(*args, **kwargs) 2022-11-23T02:58:19.6749277Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6749664Z self.run_subtests( 2022-11-23T02:58:19.6750203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6750640Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6751201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6751625Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6752168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6752570Z output = model(*input) 2022-11-23T02:58:19.6753049Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6753426Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6753978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6754513Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6755106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6755487Z _lazy_init(state, module) 2022-11-23T02:58:19.6755993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.6756406Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6756908Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6757292Z return func(*args, **kwargs) 2022-11-23T02:58:19.6757907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6758277Z p_assert( 2022-11-23T02:58:19.6758755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6759152Z traceback.print_stack() 2022-11-23T02:58:19.6759552Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6760029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6760415Z File "", line 1, in 2022-11-23T02:58:19.6760793Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6761152Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6761529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6761901Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6762300Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6762625Z self.run() 2022-11-23T02:58:19.6762965Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6763334Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6763844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6764237Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6764770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6765146Z getattr(self, test_name)() 
2022-11-23T02:58:19.6765670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6766044Z fn() 2022-11-23T02:58:19.6766540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6766926Z test(self, **param_kwargs) 2022-11-23T02:58:19.6767449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6767847Z return func(*args, **kwargs) 2022-11-23T02:58:19.6768235Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6768611Z self.run_subtests( 2022-11-23T02:58:19.6769119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6769547Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6770083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6770503Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6771071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6771452Z output = model(*input) 2022-11-23T02:58:19.6771985Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6772385Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6772665Z File "", line 1, in 2022-11-23T02:58:19.6773221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6773681Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6774256Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6774633Z _lazy_init(state, module) 2022-11-23T02:58:19.6774998Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6775435Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6775949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6776362Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6776745Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6777124Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6777632Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6778017Z return func(*args, **kwargs) 2022-11-23T02:58:19.6778390Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6778710Z self.run() 2022-11-23T02:58:19.6779230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6779616Z p_assert( 2022-11-23T02:58:19.6779937Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6780315Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6780839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6781227Z traceback.print_stack() 2022-11-23T02:58:19.6781709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6782100Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6782625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:58:19.6783005Z getattr(self, test_name)() 2022-11-23T02:58:19.6783527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6783900Z fn() 2022-11-23T02:58:19.6784393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6784780Z test(self, **param_kwargs) 2022-11-23T02:58:19.6785295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6785695Z return func(*args, **kwargs) 2022-11-23T02:58:19.6786092Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6786468Z self.run_subtests( 2022-11-23T02:58:19.6786976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6787389Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6787942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6788367Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6789151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6789556Z output = model(*input) 2022-11-23T02:58:19.6790119Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6790523Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6791064Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6791529Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6792105Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6792509Z _lazy_init(state, module) 2022-11-23T02:58:19.6793005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6793488Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6794013Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6794378Z return func(*args, **kwargs) 2022-11-23T02:58:19.6794923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6795312Z p_assert( 2022-11-23T02:58:19.6795790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6796158Z traceback.print_stack() 2022-11-23T02:58:19.6796559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6797059Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6797427Z File "", line 1, in 2022-11-23T02:58:19.6797802Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6798183Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6798542Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6798916Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6799308Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6799647Z self.run() 2022-11-23T02:58:19.6799968Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6800343Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6800869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6801242Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6801776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6802175Z getattr(self, test_name)() 2022-11-23T02:58:19.6802704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6803059Z fn() 2022-11-23T02:58:19.6803559Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6803962Z test(self, **param_kwargs) 2022-11-23T02:58:19.6804463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6804864Z return func(*args, **kwargs) 2022-11-23T02:58:19.6805272Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6805632Z self.run_subtests( 2022-11-23T02:58:19.6806136Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6806563Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6807122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6807529Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6808140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6808552Z output = model(*input) 2022-11-23T02:58:19.6809020Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6809410Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6809965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6810429Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6810988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6811461Z _lazy_init(state, module) 2022-11-23T02:58:19.6811975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6812371Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6812898Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6813282Z return func(*args, **kwargs) 2022-11-23T02:58:19.6813826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6814195Z p_assert( 2022-11-23T02:58:19.6814669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.6815056Z traceback.print_stack() 2022-11-23T02:58:19.6815327Z File "", line 1, in 2022-11-23T02:58:19.6815704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6816084Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6816440Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6816815Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6817213Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6817552Z self.run() 2022-11-23T02:58:19.6817874Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6818244Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6818762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6819136Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6819665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6820063Z getattr(self, test_name)() 2022-11-23T02:58:19.6820586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6820940Z fn() 2022-11-23T02:58:19.6821438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6821838Z test(self, **param_kwargs) 2022-11-23T02:58:19.6822335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6822733Z return func(*args, **kwargs) 2022-11-23T02:58:19.6823141Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6823502Z self.run_subtests( 2022-11-23T02:58:19.6824010Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6824440Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6825002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6825409Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6826021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6826430Z output = model(*input) 2022-11-23T02:58:19.6826900Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6827292Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6827848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6828314Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6828874Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6829585Z _lazy_init(state, module) 2022-11-23T02:58:19.6830101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6830498Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6831022Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6831405Z return func(*args, **kwargs) 2022-11-23T02:58:19.6831947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6832317Z p_assert( 2022-11-23T02:58:19.6832787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.6833175Z traceback.print_stack() 2022-11-23T02:58:19.6833560Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6834063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6834451Z File "", line 1, in 2022-11-23T02:58:19.6834813Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6835192Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6835570Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6835947Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6836323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6836659Z self.run() 2022-11-23T02:58:19.6836998Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6837347Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6837871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6838269Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6838810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6839190Z getattr(self, test_name)() 2022-11-23T02:58:19.6839720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6840092Z fn() 2022-11-23T02:58:19.6840571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6840972Z test(self, **param_kwargs) 2022-11-23T02:58:19.6841490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6841869Z return func(*args, 
**kwargs) 2022-11-23T02:58:19.6842275Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6842657Z self.run_subtests( 2022-11-23T02:58:19.6842936Z File "", line 1, in 2022-11-23T02:58:19.6843440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6843947Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6844519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6844929Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6845318Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6845696Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6846252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6846636Z output = model(*input) 2022-11-23T02:58:19.6846990Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6847441Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6847940Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6848337Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6848722Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6849049Z self.run() 2022-11-23T02:58:19.6849566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6850029Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6850481Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6850839Z 
self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6851387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6851790Z _lazy_init(state, module) 2022-11-23T02:58:19.6852269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6852665Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6853189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6853598Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6854120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6854515Z getattr(self, test_name)() 2022-11-23T02:58:19.6855018Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6855388Z return func(*args, **kwargs) 2022-11-23T02:58:19.6855908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6856284Z fn() 2022-11-23T02:58:19.6856775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6857162Z p_assert( 2022-11-23T02:58:19.6857667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6858069Z test(self, **param_kwargs) 2022-11-23T02:58:19.6858551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6858934Z traceback.print_stack() 2022-11-23T02:58:19.6859459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6859838Z return func(*args, 
**kwargs) 2022-11-23T02:58:19.6860244Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6860628Z self.run_subtests( 2022-11-23T02:58:19.6861135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6861546Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6862158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6862594Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6863149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6863550Z output = model(*input) 2022-11-23T02:58:19.6864031Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6864416Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6864957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6865485Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6866063Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6866447Z _lazy_init(state, module) 2022-11-23T02:58:19.6866962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6867376Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6867724Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6867850Z return func(*args, **kwargs) 2022-11-23T02:58:19.6868219Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6868324Z p_assert( 2022-11-23T02:58:19.6868670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6868798Z traceback.print_stack() 2022-11-23T02:58:19.6869263Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6869516Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6869650Z File "", line 1, in 2022-11-23T02:58:19.6869869Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6869996Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6870201Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6870354Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6870570Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6870676Z self.run() 2022-11-23T02:58:19.6870884Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6871038Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6871392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6871513Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6871885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6872009Z getattr(self, test_name)() 2022-11-23T02:58:19.6872379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6872477Z fn() 2022-11-23T02:58:19.6872846Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6872969Z test(self, **param_kwargs) 2022-11-23T02:58:19.6873313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6873443Z return func(*args, **kwargs) 2022-11-23T02:58:19.6873698Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6873812Z self.run_subtests( 2022-11-23T02:58:19.6874243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6874418Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6874795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6874950Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6875315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6875437Z output = model(*input) 2022-11-23T02:58:19.6875835Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6875978Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6876363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6876547Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6949660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6949885Z _lazy_init(state, module) 2022-11-23T02:58:19.6950360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.6950491Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6950845Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6950969Z return func(*args, **kwargs) 2022-11-23T02:58:19.6951374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6951477Z p_assert( 2022-11-23T02:58:19.6951819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6951944Z traceback.print_stack() 2022-11-23T02:58:19.6952056Z File "", line 1, in 2022-11-23T02:58:19.6952273Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6952417Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6952615Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6952763Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6952970Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6953068Z self.run() 2022-11-23T02:58:19.6953266Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6953395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6953743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6953873Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6954240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6954355Z getattr(self, test_name)() 2022-11-23T02:58:19.6954718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6954807Z fn() 2022-11-23T02:58:19.6955174Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6955281Z test(self, **param_kwargs) 2022-11-23T02:58:19.6955636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6955759Z return func(*args, **kwargs) 2022-11-23T02:58:19.6956003Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6956305Z self.run_subtests( 2022-11-23T02:58:19.6956680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6956837Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6957201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6957338Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6957715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6957827Z output = model(*input) 2022-11-23T02:58:19.6958233Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6958367Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6958749Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6958922Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6959283Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6959388Z _lazy_init(state, module) 2022-11-23T02:58:19.6959746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.6959883Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6960220Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6960351Z return func(*args, **kwargs) 2022-11-23T02:58:19.6960726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6960822Z p_assert( 2022-11-23T02:58:19.6961155Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6961267Z traceback.print_stack() 2022-11-23T02:58:19.6961505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6961736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6961854Z File "", line 1, in 2022-11-23T02:58:19.6961978Z File "", line 1, in 2022-11-23T02:58:19.6962193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6962328Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6962516Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6962670Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6962875Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6963013Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6963225Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6963324Z self.run() 2022-11-23T02:58:19.6963520Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6963666Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6963856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6963994Z self._target(*self._args, 
**self._kwargs) 2022-11-23T02:58:19.6964206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6964300Z self.run() 2022-11-23T02:58:19.6964494Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6964634Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6964978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6965220Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6965569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6965691Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6966054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6966173Z getattr(self, test_name)() 2022-11-23T02:58:19.6966533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6966647Z getattr(self, test_name)() 2022-11-23T02:58:19.6967008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6967140Z fn() 2022-11-23T02:58:19.6967492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6967582Z fn() 2022-11-23T02:58:19.6967959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6968074Z test(self, **param_kwargs) 2022-11-23T02:58:19.6968434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6968557Z test(self, **param_kwargs) 2022-11-23T02:58:19.6968913Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6969022Z return func(*args, **kwargs) 2022-11-23T02:58:19.6969377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6969500Z return func(*args, **kwargs) 2022-11-23T02:58:19.6969755Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6969863Z self.run_subtests( 2022-11-23T02:58:19.6970115Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6970219Z self.run_subtests( 2022-11-23T02:58:19.6970571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6970718Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6971067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6971226Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6971598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6971742Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6972103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6972245Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6972621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6972723Z output = model(*input) 2022-11-23T02:58:19.6973088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:58:19.6973201Z output = model(*input) 2022-11-23T02:58:19.6973523Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6973664Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6974001Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6974141Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6974566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6974736Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6975117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6975288Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6975653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6975767Z _lazy_init(state, module) 2022-11-23T02:58:19.6976131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6976327Z _lazy_init(state, module) 2022-11-23T02:58:19.6976678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6976817Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6977160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6977294Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6977633Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6977760Z return func(*args, **kwargs) 2022-11-23T02:58:19.6978095Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6978215Z return func(*args, **kwargs) 2022-11-23T02:58:19.6978604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6978693Z p_assert( 2022-11-23T02:58:19.6979062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6979157Z p_assert( 2022-11-23T02:58:19.6979503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6979624Z traceback.print_stack() 2022-11-23T02:58:19.6979961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6980078Z traceback.print_stack() 2022-11-23T02:58:19.6980314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6980537Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.6980659Z File "", line 1, in 2022-11-23T02:58:19.6980872Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6981017Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6981212Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6981363Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6981582Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6981678Z self.run() 2022-11-23T02:58:19.6981865Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6982006Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6982346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6982467Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6982832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6982953Z getattr(self, test_name)() 2022-11-23T02:58:19.6983313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6983412Z fn() 2022-11-23T02:58:19.6983816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6983947Z test(self, **param_kwargs) 2022-11-23T02:58:19.6984306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6984421Z return func(*args, **kwargs) 2022-11-23T02:58:19.6984670Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6984775Z self.run_subtests( 2022-11-23T02:58:19.6985121Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6985314Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6985682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6985838Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6986220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6986333Z output = model(*input) 2022-11-23T02:58:19.6986658Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6986801Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6986934Z File "", line 1, in 2022-11-23T02:58:19.6987298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6987477Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6987846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6987961Z _lazy_init(state, module) 2022-11-23T02:58:19.6988173Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.6988312Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.6988666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6988802Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6989177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.6989335Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.6989690Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:58:19.6989810Z return func(*args, **kwargs) 2022-11-23T02:58:19.6990025Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.6990127Z self.run() 2022-11-23T02:58:19.6990508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6990600Z p_assert( 2022-11-23T02:58:19.6990793Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.6990935Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.6991272Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.6991393Z traceback.print_stack() 2022-11-23T02:58:19.6991727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.6991858Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.6992219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.6992336Z getattr(self, test_name)() 2022-11-23T02:58:19.6992683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.6992777Z fn() 2022-11-23T02:58:19.6993213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.6993339Z test(self, **param_kwargs) 2022-11-23T02:58:19.6993692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.6993815Z return func(*args, **kwargs) 2022-11-23T02:58:19.6994062Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.6994165Z self.run_subtests( 2022-11-23T02:58:19.6994505Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.6994730Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.6995095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.6995246Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.6995624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.6995741Z output = model(*input) 2022-11-23T02:58:19.6996063Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.6996194Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.6996560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.6996736Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.6997111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.6997229Z _lazy_init(state, module) 2022-11-23T02:58:19.6997582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.6997722Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.6998057Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.6998177Z return func(*args, **kwargs) 2022-11-23T02:58:19.6998542Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.6998646Z p_assert( 2022-11-23T02:58:19.6998977Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.6999101Z traceback.print_stack() 2022-11-23T02:58:19.6999336Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6999575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.6999700Z File "", line 1, in 2022-11-23T02:58:19.6999905Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7000032Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7000229Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7000374Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7000500Z File "", line 1, in 2022-11-23T02:58:19.7000709Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7000801Z self.run() 2022-11-23T02:58:19.7001007Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7001136Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7001343Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7001481Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7001826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7002004Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7002210Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7002358Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7002721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7002826Z getattr(self, test_name)() 2022-11-23T02:58:19.7003035Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7003132Z self.run() 
2022-11-23T02:58:19.7003483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7003620Z fn() 2022-11-23T02:58:19.7003824Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7003964Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7004323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7004435Z test(self, **param_kwargs) 2022-11-23T02:58:19.7004772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7004897Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7005258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7005375Z return func(*args, **kwargs) 2022-11-23T02:58:19.7005734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7005855Z getattr(self, test_name)() 2022-11-23T02:58:19.7006093Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7006203Z self.run_subtests( 2022-11-23T02:58:19.7006561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7006652Z fn() 2022-11-23T02:58:19.7007010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7007166Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7007532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7007655Z test(self, **param_kwargs) 2022-11-23T02:58:19.7008000Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7008159Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7008511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7008630Z return func(*args, **kwargs) 2022-11-23T02:58:19.7009012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7009132Z output = model(*input) 2022-11-23T02:58:19.7009377Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7009485Z self.run_subtests( 2022-11-23T02:58:19.7009796Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7009931Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7010287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7010453Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7010826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7011051Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7011433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7011579Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7011932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7012049Z _lazy_init(state, module) 2022-11-23T02:58:19.7012429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T02:58:19.7012547Z output = model(*input) 2022-11-23T02:58:19.7012979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7013120Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7013454Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7013587Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7013913Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7014035Z return func(*args, **kwargs) 2022-11-23T02:58:19.7014407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7014581Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7014955Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7015058Z p_assert( 2022-11-23T02:58:19.7015424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7015540Z _lazy_init(state, module) 2022-11-23T02:58:19.7015867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7015986Z traceback.print_stack() 2022-11-23T02:58:19.7016341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7016483Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7016817Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7016930Z return func(*args, **kwargs) 2022-11-23T02:58:19.7017311Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7017416Z p_assert( 2022-11-23T02:58:19.7017739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7017858Z traceback.print_stack() 2022-11-23T02:58:19.7018088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7018328Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7018452Z File "", line 1, in 2022-11-23T02:58:19.7018664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7018809Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7019007Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7019140Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7019353Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7019450Z self.run() 2022-11-23T02:58:19.7019654Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7019794Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7020139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7020315Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7020686Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7020792Z getattr(self, test_name)() 2022-11-23T02:58:19.7021147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7021242Z fn() 2022-11-23T02:58:19.7021605Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7021726Z test(self, **param_kwargs) 2022-11-23T02:58:19.7022130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7022252Z return func(*args, **kwargs) 2022-11-23T02:58:19.7022490Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7022608Z self.run_subtests( 2022-11-23T02:58:19.7022960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7023118Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7023477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7023621Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7023989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7024106Z output = model(*input) 2022-11-23T02:58:19.7024431Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7024556Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7024936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7025107Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7025232Z File "", line 1, in 2022-11-23T02:58:19.7025595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7025706Z _lazy_init(state, module) 2022-11-23T02:58:19.7026059Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7026203Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7026399Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7026536Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7026871Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7026995Z return func(*args, **kwargs) 2022-11-23T02:58:19.7027204Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7027349Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7027725Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7027809Z p_assert( 2022-11-23T02:58:19.7028015Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7028115Z self.run() 2022-11-23T02:58:19.7028451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7028567Z traceback.print_stack() 2022-11-23T02:58:19.7028773Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7028912Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7029673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7029798Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7030175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7030292Z getattr(self, test_name)() 2022-11-23T02:58:19.7030646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7030738Z 
fn() 2022-11-23T02:58:19.7031097Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7031220Z test(self, **param_kwargs) 2022-11-23T02:58:19.7031645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7031752Z return func(*args, **kwargs) 2022-11-23T02:58:19.7032000Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7032109Z self.run_subtests( 2022-11-23T02:58:19.7032461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7032620Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7032989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7033135Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7033511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7033617Z output = model(*input) 2022-11-23T02:58:19.7033950Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7034084Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7034468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7034647Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7035010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7035125Z _lazy_init(state, module) 2022-11-23T02:58:19.7035475Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7035601Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7035937Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7036061Z return func(*args, **kwargs) 2022-11-23T02:58:19.7036438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7036532Z p_assert( 2022-11-23T02:58:19.7036868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7036991Z traceback.print_stack() 2022-11-23T02:58:19.7037234Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7037458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7037581Z File "", line 1, in 2022-11-23T02:58:19.7037791Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7037927Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7038119Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7038275Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7038480Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7038564Z self.run() 2022-11-23T02:58:19.7038815Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7038963Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7039310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7039441Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7039804Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7039922Z getattr(self, test_name)() 2022-11-23T02:58:19.7040282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7040424Z fn() 2022-11-23T02:58:19.7040797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7040920Z test(self, **param_kwargs) 2022-11-23T02:58:19.7041275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7041402Z return func(*args, **kwargs) 2022-11-23T02:58:19.7041528Z File "", line 1, in 2022-11-23T02:58:19.7041777Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7041887Z self.run_subtests( 2022-11-23T02:58:19.7042232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7042389Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7042598Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7042737Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7043104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7043249Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7043448Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7043598Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7043964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7044078Z output = model(*input) 
2022-11-23T02:58:19.7044287Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7044383Z self.run() 2022-11-23T02:58:19.7044711Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7044848Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7045051Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7045178Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7045558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7045731Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7046066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7046197Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7046560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7046678Z _lazy_init(state, module) 2022-11-23T02:58:19.7047039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7047163Z getattr(self, test_name)() 2022-11-23T02:58:19.7047501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7047641Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7048048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7048149Z fn() 2022-11-23T02:58:19.7048488Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7048613Z return func(*args, **kwargs) 2022-11-23T02:58:19.7048978Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7049083Z test(self, **param_kwargs) 2022-11-23T02:58:19.7049466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7049615Z p_assert( 2022-11-23T02:58:19.7049972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7050096Z return func(*args, **kwargs) 2022-11-23T02:58:19.7050492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7050613Z traceback.print_stack() 2022-11-23T02:58:19.7050861Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7050955Z self.run_subtests( 2022-11-23T02:58:19.7051308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7051462Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7051824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7051974Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7052349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7052465Z output = model(*input) 2022-11-23T02:58:19.7052792Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7052916Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7053289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7053466Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7053839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7053952Z _lazy_init(state, module) 2022-11-23T02:58:19.7054305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7054446Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7054781Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7054891Z return func(*args, **kwargs) 2022-11-23T02:58:19.7055269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7055368Z p_assert( 2022-11-23T02:58:19.7055702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7055819Z traceback.print_stack() 2022-11-23T02:58:19.7056056Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7056295Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7056534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7056757Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7056991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7057265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7057505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7057741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7057967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7058194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7058419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7058692Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7059706Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7059940Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:58:19.7060949Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7061185Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:58:19.7061427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7061656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7061891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7062124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7062349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7062574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7062804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7063021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7063244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7063465Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7063692Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7063920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7064143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7064368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7064587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7064798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7065030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7065253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7065534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7065761Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7065987Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7066210Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7066435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7066656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7066911Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7067139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7067367Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7067594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7067813Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7068037Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7068260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7068487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7068696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7069139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7069391Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7069626Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7069741Z dist init r=1, world=2 2022-11-23T02:58:19.7070082Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7070414Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7070745Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7071069Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7071390Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7071705Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7072037Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7072361Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7072686Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7072997Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.7073380Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7073713Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7073828Z dist init r=0, world=2 2022-11-23T02:58:19.7074142Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7074445Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7074814Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7075112Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7075413Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7075729Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7076043Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7076355Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.7076674Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7076986Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7077291Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7077607Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7077704Z ok (35.861s) 2022-11-23T02:58:19.7077921Z test_delayed_optim_step_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:58:19.7078223Z Tests the FSDP forward, backward, and optimizer step runtime by ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86494 2022-11-23T02:58:19.7078448Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86495 2022-11-23T02:58:19.7078841Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7079012Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7079402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7079595Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7079968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7080143Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7080521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7080742Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7081004Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.7081250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.7081662Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7082063Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.7082294Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.7082588Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.7082833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7083049Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7084097Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7084209Z warnings.warn( 2022-11-23T02:58:19.7085230Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.7085342Z warnings.warn( 2022-11-23T02:58:19.7085465Z File "", line 1, in 2022-11-23T02:58:19.7085686Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7085831Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7086039Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7086188Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7086405Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7086493Z self.run() 2022-11-23T02:58:19.7086701Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7086840Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7087191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7087328Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7087691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7087814Z getattr(self, test_name)() 2022-11-23T02:58:19.7088166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7088259Z fn() 2022-11-23T02:58:19.7088633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7088758Z test(self, **param_kwargs) 2022-11-23T02:58:19.7089116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7089237Z return func(*args, **kwargs) 2022-11-23T02:58:19.7089486Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7089651Z self.run_subtests( 2022-11-23T02:58:19.7090002Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7090163Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7090289Z File "", line 1, in 2022-11-23T02:58:19.7090651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7090802Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7091177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7091344Z output = model(*input) 2022-11-23T02:58:19.7091561Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7091688Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7092023Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7092160Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7092361Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7092509Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7092888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7093063Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7093281Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7093372Z self.run() 2022-11-23T02:58:19.7093740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7093859Z _lazy_init(state, module) 2022-11-23T02:58:19.7094064Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7094210Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:58:19.7094562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7094703Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7095041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7095157Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7095498Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7095625Z return func(*args, **kwargs) 2022-11-23T02:58:19.7095990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7096112Z getattr(self, test_name)() 2022-11-23T02:58:19.7096500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7096600Z p_assert( 2022-11-23T02:58:19.7096966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7097046Z fn() 2022-11-23T02:58:19.7097390Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7097515Z traceback.print_stack() 2022-11-23T02:58:19.7097884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7098003Z test(self, **param_kwargs) 2022-11-23T02:58:19.7098365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7098495Z return func(*args, **kwargs) 2022-11-23T02:58:19.7098746Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7098887Z self.run_subtests( 2022-11-23T02:58:19.7099250Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7099408Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7099772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7099920Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7100296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7100416Z output = model(*input) 2022-11-23T02:58:19.7100859Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7100985Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7101365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7101544Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7101915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7102035Z _lazy_init(state, module) 2022-11-23T02:58:19.7102392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7102531Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7102866Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7102978Z return func(*args, **kwargs) 2022-11-23T02:58:19.7103361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7103463Z p_assert( 2022-11-23T02:58:19.7103806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7103932Z traceback.print_stack() 2022-11-23T02:58:19.7104167Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7104409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7104538Z File "", line 1, in 2022-11-23T02:58:19.7104734Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7104875Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7105073Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7105222Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7105435Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7105532Z self.run() 2022-11-23T02:58:19.7105735Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7105864Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7106209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7106342Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7106709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7106833Z getattr(self, test_name)() 2022-11-23T02:58:19.7107193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7107290Z fn() 2022-11-23T02:58:19.7107654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7107760Z test(self, **param_kwargs) 2022-11-23T02:58:19.7108169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7108299Z return func(*args, 
**kwargs) 2022-11-23T02:58:19.7108550Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7108664Z self.run_subtests( 2022-11-23T02:58:19.7109258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7109432Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7109803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7110020Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7110406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7110523Z output = model(*input) 2022-11-23T02:58:19.7110854Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7110993Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7111372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7111554Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7111924Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7112027Z _lazy_init(state, module) 2022-11-23T02:58:19.7112381Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7112529Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7112868Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7112989Z return func(*args, **kwargs) 2022-11-23T02:58:19.7113382Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7113481Z p_assert( 2022-11-23T02:58:19.7113822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7113930Z traceback.print_stack() 2022-11-23T02:58:19.7114060Z File "", line 1, in 2022-11-23T02:58:19.7114268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7114408Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7114614Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7114766Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7114981Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7115066Z self.run() 2022-11-23T02:58:19.7115274Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7115418Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7115765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7115895Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7116258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7116385Z getattr(self, test_name)() 2022-11-23T02:58:19.7116746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7116829Z fn() 2022-11-23T02:58:19.7117202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7117321Z test(self, **param_kwargs) 2022-11-23T02:58:19.7117744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:19.7117879Z return func(*args, **kwargs) 2022-11-23T02:58:19.7118130Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7118242Z self.run_subtests( 2022-11-23T02:58:19.7118604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7118750Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7119115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7119313Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7119698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7119817Z output = model(*input) 2022-11-23T02:58:19.7120143Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7120285Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7120665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7120824Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7121198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7121319Z _lazy_init(state, module) 2022-11-23T02:58:19.7121676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7121821Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7122166Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7122290Z return func(*args, **kwargs) 2022-11-23T02:58:19.7122668Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7122753Z p_assert( 2022-11-23T02:58:19.7123094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7123217Z traceback.print_stack() 2022-11-23T02:58:19.7123459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7123700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7123827Z File "", line 1, in 2022-11-23T02:58:19.7124044Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7124184Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7124370Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7124523Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7124744Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7124845Z self.run() 2022-11-23T02:58:19.7125050Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7125198Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7125540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7125672Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7126024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7126154Z getattr(self, test_name)() 2022-11-23T02:58:19.7126517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7126614Z fn() 2022-11-23T02:58:19.7127039Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7127166Z test(self, **param_kwargs) 2022-11-23T02:58:19.7127528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7127635Z return func(*args, **kwargs) 2022-11-23T02:58:19.7127892Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7128005Z self.run_subtests( 2022-11-23T02:58:19.7128366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7128576Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7128949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7129099Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7129486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7129588Z output = model(*input) 2022-11-23T02:58:19.7129915Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7130054Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7130442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7130618Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7130991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7131113Z _lazy_init(state, module) 2022-11-23T02:58:19.7131469Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.7131617Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7131946Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7132074Z return func(*args, **kwargs) 2022-11-23T02:58:19.7132465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7132568Z p_assert( 2022-11-23T02:58:19.7132911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7133036Z traceback.print_stack() 2022-11-23T02:58:19.7133167Z File "", line 1, in 2022-11-23T02:58:19.7133368Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7133511Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7133714Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7133868Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7134085Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7134189Z self.run() 2022-11-23T02:58:19.7134393Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7134541Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7134872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7135005Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7135373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7135502Z getattr(self, test_name)() 2022-11-23T02:58:19.7135869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7135965Z fn() 2022-11-23T02:58:19.7136383Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7136506Z test(self, **param_kwargs) 2022-11-23T02:58:19.7136855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7136983Z return func(*args, **kwargs) 2022-11-23T02:58:19.7137236Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7137350Z self.run_subtests( 2022-11-23T02:58:19.7137706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7137921Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7138290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7138446Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7138812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7138934Z output = model(*input) 2022-11-23T02:58:19.7139264Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7139409Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7139789Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7139969Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7140346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7140465Z _lazy_init(state, module) 2022-11-23T02:58:19.7140805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.7140951Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7141291Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7141415Z return func(*args, **kwargs) 2022-11-23T02:58:19.7141798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7141903Z p_assert( 2022-11-23T02:58:19.7142242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7142366Z traceback.print_stack() 2022-11-23T02:58:19.7142591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7142836Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7142969Z File "", line 1, in 2022-11-23T02:58:19.7143099Z File "", line 1, in 2022-11-23T02:58:19.7143319Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7143464Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7143671Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7143810Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7144021Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7144163Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7144382Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7144487Z self.run() 2022-11-23T02:58:19.7144690Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7144842Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7145050Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7145227Z self._target(*self._args, 
**self._kwargs) 2022-11-23T02:58:19.7145448Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7145549Z self.run() 2022-11-23T02:58:19.7145900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7146035Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7146234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7146378Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7146727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7146904Z getattr(self, test_name)() 2022-11-23T02:58:19.7147249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7147382Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7147753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7147848Z fn() 2022-11-23T02:58:19.7148216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7148341Z getattr(self, test_name)() 2022-11-23T02:58:19.7148693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7148816Z test(self, **param_kwargs) 2022-11-23T02:58:19.7149399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7149498Z fn() 2022-11-23T02:58:19.7149859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7149987Z return func(*args, **kwargs) 2022-11-23T02:58:19.7150393Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7150517Z test(self, **param_kwargs) 2022-11-23T02:58:19.7150754Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7150868Z self.run_subtests( 2022-11-23T02:58:19.7151234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7151360Z return func(*args, **kwargs) 2022-11-23T02:58:19.7151719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7151888Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7152139Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7152248Z self.run_subtests( 2022-11-23T02:58:19.7152605Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7152756Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7153116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7153273Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7153654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7153771Z output = model(*input) 2022-11-23T02:58:19.7154137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7154294Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7154608Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:58:19.7154822Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7155219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7155341Z output = model(*input) 2022-11-23T02:58:19.7155719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7155897Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7156229Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7156370Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7156801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7156921Z _lazy_init(state, module) 2022-11-23T02:58:19.7157305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7157482Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7157836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7157981Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7158350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7158472Z _lazy_init(state, module) 2022-11-23T02:58:19.7158814Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7158925Z return func(*args, **kwargs) 2022-11-23T02:58:19.7159276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7159417Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7159803Z 
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7159906Z p_assert( 2022-11-23T02:58:19.7160242Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7160366Z return func(*args, **kwargs) 2022-11-23T02:58:19.7160690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7160817Z traceback.print_stack() 2022-11-23T02:58:19.7161198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7161299Z p_assert( 2022-11-23T02:58:19.7161634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7161757Z traceback.print_stack() 2022-11-23T02:58:19.7162004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7162244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7162356Z File "", line 1, in 2022-11-23T02:58:19.7162571Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7162716Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7162924Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7163079Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7163291Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7163397Z self.run() 2022-11-23T02:58:19.7163602Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7163731Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7164127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7164264Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7164636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7164763Z getattr(self, test_name)() 2022-11-23T02:58:19.7165126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7165224Z fn() 2022-11-23T02:58:19.7165576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7165695Z test(self, **param_kwargs) 2022-11-23T02:58:19.7166119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7166245Z return func(*args, **kwargs) 2022-11-23T02:58:19.7166505Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7166616Z self.run_subtests( 2022-11-23T02:58:19.7166982Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7167145Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7167499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7167655Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7168036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7168157Z output = model(*input) 2022-11-23T02:58:19.7168488Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7168629Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7169013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7169192Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7169567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7169672Z _lazy_init(state, module) 2022-11-23T02:58:19.7170028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7170168Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7170514Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7170644Z return func(*args, **kwargs) 2022-11-23T02:58:19.7171026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7171128Z p_assert( 2022-11-23T02:58:19.7171477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7171588Z traceback.print_stack() 2022-11-23T02:58:19.7171715Z File "", line 1, in 2022-11-23T02:58:19.7171928Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7172071Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7172276Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7172428Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7172645Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7172735Z self.run() 2022-11-23T02:58:19.7172936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7173081Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7173483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7173619Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7173991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7174116Z getattr(self, test_name)() 2022-11-23T02:58:19.7174479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7174559Z fn() 2022-11-23T02:58:19.7174929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7175054Z test(self, **param_kwargs) 2022-11-23T02:58:19.7175467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7175592Z return func(*args, **kwargs) 2022-11-23T02:58:19.7175850Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7175963Z self.run_subtests( 2022-11-23T02:58:19.7176324Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7176469Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7176840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7176994Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7177374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7177498Z output = model(*input) 2022-11-23T02:58:19.7177828Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7177967Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7178351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7178514Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7178884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7179004Z _lazy_init(state, module) 2022-11-23T02:58:19.7179417Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7179560Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7179909Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7180036Z return func(*args, **kwargs) 2022-11-23T02:58:19.7180422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7180507Z p_assert( 2022-11-23T02:58:19.7180855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7180983Z traceback.print_stack() 2022-11-23T02:58:19.7181227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7181469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7181596Z File "", line 1, in 2022-11-23T02:58:19.7181810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7181953Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7182142Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7182297Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7182512Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7182613Z self.run() 2022-11-23T02:58:19.7182866Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7183022Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7183370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7183486Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7183852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7183976Z getattr(self, test_name)() 2022-11-23T02:58:19.7184339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7184483Z fn() 2022-11-23T02:58:19.7184858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7184977Z test(self, **param_kwargs) 2022-11-23T02:58:19.7185344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7185452Z return func(*args, 
**kwargs) 2022-11-23T02:58:19.7185710Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7185823Z self.run_subtests( 2022-11-23T02:58:19.7186181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7186344Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7186712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7186869Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7187253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7187355Z output = model(*input) 2022-11-23T02:58:19.7187687Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7187828Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7188212Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7188388Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7188757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7188880Z _lazy_init(state, module) 2022-11-23T02:58:19.7189462Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7189594Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7189938Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7190067Z return func(*args, **kwargs) 2022-11-23T02:58:19.7190453Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7190555Z p_assert( 2022-11-23T02:58:19.7190899Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7191026Z traceback.print_stack() 2022-11-23T02:58:19.7191155Z File "", line 1, in 2022-11-23T02:58:19.7191350Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7191496Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7191706Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7191858Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7192076Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7192186Z self.run() 2022-11-23T02:58:19.7192477Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7192615Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7192964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7193098Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7193464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7193588Z getattr(self, test_name)() 2022-11-23T02:58:19.7193951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7194112Z fn() 2022-11-23T02:58:19.7194486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7194591Z test(self, **param_kwargs) 2022-11-23T02:58:19.7194954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:19.7195083Z return func(*args, **kwargs) 2022-11-23T02:58:19.7195337Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7195450Z self.run_subtests( 2022-11-23T02:58:19.7195809Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7195972Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7196343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7196482Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7196864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7196985Z output = model(*input) 2022-11-23T02:58:19.7197318Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7197463Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7197841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7198020Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7198393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7198496Z _lazy_init(state, module) 2022-11-23T02:58:19.7198853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7198996Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7199340Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7199466Z return func(*args, **kwargs) 2022-11-23T02:58:19.7199851Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7199951Z p_assert( 2022-11-23T02:58:19.7200289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7200396Z traceback.print_stack() 2022-11-23T02:58:19.7200639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7200879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7201010Z File "", line 1, in 2022-11-23T02:58:19.7201230Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7201374Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7201580Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7201781Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7201898Z File "", line 1, in 2022-11-23T02:58:19.7202118Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7202222Z self.run() 2022-11-23T02:58:19.7202430Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7202577Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7202791Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7202934Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7203313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7203448Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7203655Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7203812Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7204180Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7204306Z getattr(self, test_name)() 2022-11-23T02:58:19.7204521Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7204627Z self.run() 2022-11-23T02:58:19.7204971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7205069Z fn() 2022-11-23T02:58:19.7205274Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7205426Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7205798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7205923Z test(self, **param_kwargs) 2022-11-23T02:58:19.7206274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7206405Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7206754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7206879Z return func(*args, **kwargs) 2022-11-23T02:58:19.7207244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7207369Z getattr(self, test_name)() 2022-11-23T02:58:19.7207624Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7207743Z self.run_subtests( 2022-11-23T02:58:19.7208104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7208200Z fn() 2022-11-23T02:58:19.7208543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 
2022-11-23T02:58:19.7208709Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7209078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7209202Z test(self, **param_kwargs) 2022-11-23T02:58:19.7209566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7209723Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7210085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7210215Z return func(*args, **kwargs) 2022-11-23T02:58:19.7210580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7210700Z output = model(*input) 2022-11-23T02:58:19.7211002Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7211120Z self.run_subtests( 2022-11-23T02:58:19.7211455Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7211602Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7211959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7212123Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7212487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7212715Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7213081Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7213238Z fsdp_loss = self._train_for_several_steps( 
2022-11-23T02:58:19.7213605Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7213729Z _lazy_init(state, module) 2022-11-23T02:58:19.7214110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7214231Z output = model(*input) 2022-11-23T02:58:19.7214569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7214715Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7215051Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7215198Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7215546Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7215672Z return func(*args, **kwargs) 2022-11-23T02:58:19.7216057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7216236Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7216602Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7216704Z p_assert( 2022-11-23T02:58:19.7217079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7217205Z _lazy_init(state, module) 2022-11-23T02:58:19.7217548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7217675Z traceback.print_stack() 2022-11-23T02:58:19.7218123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7218268Z 
handle.init_flat_param_attributes() 2022-11-23T02:58:19.7218595Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7218722Z return func(*args, **kwargs) 2022-11-23T02:58:19.7219108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7219210Z p_assert( 2022-11-23T02:58:19.7219550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7219677Z traceback.print_stack() 2022-11-23T02:58:19.7219924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7220165Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7220278Z File "", line 1, in 2022-11-23T02:58:19.7220540Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7220689Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7220896Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7221052Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7221271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7221376Z self.run() 2022-11-23T02:58:19.7221564Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7221713Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7222114Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7222248Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7222620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7222749Z getattr(self, test_name)() 2022-11-23T02:58:19.7223113Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7223211Z fn() 2022-11-23T02:58:19.7223563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7223688Z test(self, **param_kwargs) 2022-11-23T02:58:19.7224045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7224171Z return func(*args, **kwargs) 2022-11-23T02:58:19.7224431Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7224545Z self.run_subtests( 2022-11-23T02:58:19.7224903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7225071Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7225427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7225583Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7225968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7226089Z output = model(*input) 2022-11-23T02:58:19.7226418Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7226560Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7226945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7227124Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7227478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.7227601Z _lazy_init(state, module) 2022-11-23T02:58:19.7227957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7228103Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7228446Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7228572Z return func(*args, **kwargs) 2022-11-23T02:58:19.7229196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7229315Z p_assert( 2022-11-23T02:58:19.7229647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7229775Z traceback.print_stack() 2022-11-23T02:58:19.7229906Z File "", line 1, in 2022-11-23T02:58:19.7230194Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7230344Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7230553Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7230707Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7230924Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7231010Z self.run() 2022-11-23T02:58:19.7231214Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7231364Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7231797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7231932Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7232303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7232428Z getattr(self, test_name)() 2022-11-23T02:58:19.7232774Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7232874Z fn() 2022-11-23T02:58:19.7233247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7233372Z test(self, **param_kwargs) 2022-11-23T02:58:19.7233736Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7233866Z return func(*args, **kwargs) 2022-11-23T02:58:19.7234123Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7234240Z self.run_subtests( 2022-11-23T02:58:19.7234580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7234747Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7235118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7235271Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7235652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7235772Z output = model(*input) 2022-11-23T02:58:19.7236107Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7236248Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7236618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7236800Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7237175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.7237296Z _lazy_init(state, module) 2022-11-23T02:58:19.7237654Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7237802Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7238144Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7238269Z return func(*args, **kwargs) 2022-11-23T02:58:19.7238635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7238741Z p_assert( 2022-11-23T02:58:19.7239083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7239210Z traceback.print_stack() 2022-11-23T02:58:19.7239498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7239745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7239876Z File "", line 1, in 2022-11-23T02:58:19.7240092Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7240217Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7240424Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7240579Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7240800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7240956Z self.run() 2022-11-23T02:58:19.7241167Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7241313Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7241665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7241780Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7242151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7242275Z getattr(self, test_name)() 2022-11-23T02:58:19.7242640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7242737Z fn() 2022-11-23T02:58:19.7243110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7243242Z test(self, **param_kwargs) 2022-11-23T02:58:19.7243602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7243710Z return func(*args, **kwargs) 2022-11-23T02:58:19.7243967Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7244082Z self.run_subtests( 2022-11-23T02:58:19.7244440Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7244603Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7244974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7245131Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7245512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7245618Z output = model(*input) 2022-11-23T02:58:19.7245948Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7246090Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7246477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7246655Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7247027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7247141Z _lazy_init(state, module) 2022-11-23T02:58:19.7247499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7247641Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7247982Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7248110Z return func(*args, **kwargs) 2022-11-23T02:58:19.7248496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7248582Z p_assert( 2022-11-23T02:58:19.7249023Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7249158Z traceback.print_stack() 2022-11-23T02:58:19.7249289Z File "", line 1, in 2022-11-23T02:58:19.7249504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7249646Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7249850Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7249985Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7250245Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7250402Z self.run() 2022-11-23T02:58:19.7250614Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7250761Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7251115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7251250Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7251618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7251723Z getattr(self, test_name)() 2022-11-23T02:58:19.7252087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7252183Z fn() 2022-11-23T02:58:19.7252555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7252681Z test(self, **param_kwargs) 2022-11-23T02:58:19.7253034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7253158Z return func(*args, **kwargs) 2022-11-23T02:58:19.7253417Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7253513Z self.run_subtests( 2022-11-23T02:58:19.7253868Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7254031Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7254399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7254552Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7254934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7255057Z output = model(*input) 2022-11-23T02:58:19.7255385Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7255509Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7255890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7256068Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7256443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7256564Z _lazy_init(state, module) 2022-11-23T02:58:19.7256919Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7257062Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7257406Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7257518Z return func(*args, **kwargs) 2022-11-23T02:58:19.7257904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7258056Z p_assert( 2022-11-23T02:58:19.7258405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7258531Z traceback.print_stack() 2022-11-23T02:58:19.7258771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7259011Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7259140Z File "", line 1, in 2022-11-23T02:58:19.7259251Z File "", line 1, in 2022-11-23T02:58:19.7259465Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7259656Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7259861Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7260015Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7260231Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7260374Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7260572Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7260677Z self.run() 2022-11-23T02:58:19.7260878Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7261028Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7261233Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7261380Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7261595Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7261701Z self.run() 2022-11-23T02:58:19.7262034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7262170Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7262378Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7262527Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7262896Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7263018Z getattr(self, test_name)() 2022-11-23T02:58:19.7263357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7263469Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7263832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7263934Z fn() 2022-11-23T02:58:19.7264302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7264426Z getattr(self, test_name)() 2022-11-23T02:58:19.7264801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7264922Z test(self, **param_kwargs) 2022-11-23T02:58:19.7265285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7265363Z fn() 2022-11-23T02:58:19.7265722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7265846Z return func(*args, **kwargs) 2022-11-23T02:58:19.7266214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7266341Z test(self, **param_kwargs) 2022-11-23T02:58:19.7266595Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7266709Z self.run_subtests( 2022-11-23T02:58:19.7267120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7267234Z return func(*args, **kwargs) 2022-11-23T02:58:19.7267596Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7267760Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7268012Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7268126Z self.run_subtests( 2022-11-23T02:58:19.7268493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7268696Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7269275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7269424Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7269816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7269943Z output = model(*input) 2022-11-23T02:58:19.7270310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7270464Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7270796Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7270937Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7271320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7271426Z output = model(*input) 2022-11-23T02:58:19.7271812Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7271992Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7272324Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T02:58:19.7272465Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7272839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7272960Z _lazy_init(state, module) 2022-11-23T02:58:19.7273339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7273496Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7273858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7274001Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7274372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7274495Z _lazy_init(state, module) 2022-11-23T02:58:19.7274836Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7274959Z return func(*args, **kwargs) 2022-11-23T02:58:19.7275312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7275455Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7275824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7275932Z p_assert( 2022-11-23T02:58:19.7276276Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7276399Z return func(*args, **kwargs) 2022-11-23T02:58:19.7276813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7276949Z traceback.print_stack() 2022-11-23T02:58:19.7277334Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7277416Z p_assert( 2022-11-23T02:58:19.7277755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7277884Z traceback.print_stack() 2022-11-23T02:58:19.7278129Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7278370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7278566Z File "", line 1, in 2022-11-23T02:58:19.7278781Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7278924Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7279113Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7279267Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7279487Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7279590Z self.run() 2022-11-23T02:58:19.7279797Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7279945Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7280292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7280424Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7280773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7280902Z getattr(self, test_name)() 2022-11-23T02:58:19.7281263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7281365Z fn() 2022-11-23T02:58:19.7281737Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7281861Z test(self, **param_kwargs) 2022-11-23T02:58:19.7282221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7282330Z return func(*args, **kwargs) 2022-11-23T02:58:19.7282582Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7282697Z self.run_subtests( 2022-11-23T02:58:19.7283056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7283223Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7283595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7283754Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7284137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7284257Z output = model(*input) 2022-11-23T02:58:19.7284569Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7284712Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7285100Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7285278Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7285653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7285775Z _lazy_init(state, module) 2022-11-23T02:58:19.7286174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.7286324Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7286654Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7286777Z return func(*args, **kwargs) 2022-11-23T02:58:19.7287165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7287271Z p_assert( 2022-11-23T02:58:19.7287611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7287739Z traceback.print_stack() 2022-11-23T02:58:19.7287916Z File "", line 1, in 2022-11-23T02:58:19.7288114Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7288257Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7288463Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7288618Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7288834Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7288939Z self.run() 2022-11-23T02:58:19.7289141Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7289292Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7289624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7289757Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7290127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7290254Z getattr(self, test_name)() 2022-11-23T02:58:19.7290619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7290719Z fn() 2022-11-23T02:58:19.7291090Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7291210Z test(self, **param_kwargs) 2022-11-23T02:58:19.7291554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7291681Z return func(*args, **kwargs) 2022-11-23T02:58:19.7291933Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7292046Z self.run_subtests( 2022-11-23T02:58:19.7292402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7292569Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7292941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7293096Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7293456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7293576Z output = model(*input) 2022-11-23T02:58:19.7293905Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7294045Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7294424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7294604Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7294979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7295100Z _lazy_init(state, module) 2022-11-23T02:58:19.7295483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.7295633Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7295978Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7296100Z return func(*args, **kwargs) 2022-11-23T02:58:19.7296481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7296585Z p_assert( 2022-11-23T02:58:19.7296929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7297120Z traceback.print_stack() 2022-11-23T02:58:19.7297347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7297588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7297722Z File "", line 1, in 2022-11-23T02:58:19.7297937Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7298081Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7298283Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7298439Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7298637Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7298742Z self.run() 2022-11-23T02:58:19.7298947Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7299093Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7299446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7299579Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7299952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7300076Z getattr(self, test_name)() 
2022-11-23T02:58:19.7300424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7300522Z fn() 2022-11-23T02:58:19.7300895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7301017Z test(self, **param_kwargs) 2022-11-23T02:58:19.7301374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7301500Z return func(*args, **kwargs) 2022-11-23T02:58:19.7301756Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7301869Z self.run_subtests( 2022-11-23T02:58:19.7302216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7302380Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7302749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7302902Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7303281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7303400Z output = model(*input) 2022-11-23T02:58:19.7303731Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7303878Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7304243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7304420Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7304842Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:19.7304970Z _lazy_init(state, module) 2022-11-23T02:58:19.7305330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7305474Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7305816Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7305943Z return func(*args, **kwargs) 2022-11-23T02:58:19.7306305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7306457Z p_assert( 2022-11-23T02:58:19.7306802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7306928Z traceback.print_stack() 2022-11-23T02:58:19.7307060Z File "", line 1, in 2022-11-23T02:58:19.7307275Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7307418Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7307624Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7307758Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7307977Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7308082Z self.run() 2022-11-23T02:58:19.7308285Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7308434Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7308782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7308915Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7309489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7309617Z getattr(self, test_name)() 
2022-11-23T02:58:19.7309982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7310080Z fn() 2022-11-23T02:58:19.7310453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7310576Z test(self, **param_kwargs) 2022-11-23T02:58:19.7310939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7311064Z return func(*args, **kwargs) 2022-11-23T02:58:19.7311305Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 193, in test_delayed_optim_step 2022-11-23T02:58:19.7311420Z self.run_subtests( 2022-11-23T02:58:19.7311780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7311943Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7312311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7312465Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7312846Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7312965Z output = model(*input) 2022-11-23T02:58:19.7313276Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7313421Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7313805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7313982Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7314424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:19.7314554Z _lazy_init(state, module) 2022-11-23T02:58:19.7314914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7315059Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7315382Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7315508Z return func(*args, **kwargs) 2022-11-23T02:58:19.7315890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7316057Z p_assert( 2022-11-23T02:58:19.7316402Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7316528Z traceback.print_stack() 2022-11-23T02:58:19.7316776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7317021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7317235Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7317480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7317717Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7317947Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7318183Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7318411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7318650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7318881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7319109Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7319320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7320343Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7320582Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:58:19.7321595Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7321832Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:58:19.7322069Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7322303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7322543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7322779Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7323063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7323298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7323511Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7323745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7323979Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7324209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7324483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7324713Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7324944Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7325177Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7325386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7325612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7325842Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7326074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7326303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7326533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7326758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7326990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7327221Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7327431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7327656Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7327884Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7328112Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7328342Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7328573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7328799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7329028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7329239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7329470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7329697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7329925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7330152Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7330265Z dist init r=0, world=2 2022-11-23T02:58:19.7330609Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.7330984Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7331329Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7331631Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7331944Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7332275Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7332642Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7332960Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7333291Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7333617Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7333931Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.7334262Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7334374Z dist init r=1, world=2 2022-11-23T02:58:19.7334696Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7335007Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7335305Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7335618Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7335931Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7336256Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7336573Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7336885Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7337210Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.7337536Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7337896Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7338217Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7338320Z ok (27.547s) 2022-11-23T02:58:19.7338527Z test_delayed_reduce_scatter_offload_false_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:58:19.7338849Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86577 2022-11-23T02:58:19.7339071Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86578 2022-11-23T02:58:19.7339515Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7339693Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7340085Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7340284Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7340655Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7340833Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7341197Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7341388Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:58:19.7341641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.7341895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.7356427Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7356839Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7357080Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.7357314Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.7357556Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7357776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7358819Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7358939Z warnings.warn( 2022-11-23T02:58:19.7359960Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7360072Z warnings.warn( 2022-11-23T02:58:19.7360311Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7360549Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7360895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7361144Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7361375Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7361601Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7361816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7362050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7362281Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7362572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7362798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7363034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7363266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7363497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7363706Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7363931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7364160Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7364389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7365452Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7365661Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.7366701Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7366908Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.7367144Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7367383Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7367618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7367832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7368063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7368291Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7368521Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7368758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7368988Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7369218Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7369507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7369745Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7369957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7370190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7370419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7370645Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7370920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7371150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7371384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7371611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7371819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7372050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7372278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7372504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7373278Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7374033Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7374267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7374500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7374727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7374956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7375055Z dist init r=0, world=2 2022-11-23T02:58:19.7375164Z dist init r=1, world=2 2022-11-23T02:58:19.7375262Z ok (5.513s) 2022-11-23T02:58:19.7375479Z test_delayed_reduce_scatter_offload_false_none (__main__.TestParityWithDDP) 2022-11-23T02:58:19.7376403Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82704 for platform(s) linux, rocm. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:58:19.7376634Z test_delayed_reduce_scatter_offload_false_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:58:19.7377531Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82398 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:58:19.7377753Z test_delayed_reduce_scatter_offload_true_no_shard (__main__.TestParityWithDDP) 2022-11-23T02:58:19.7378119Z Tests the FSDP forward, backward, and optimizer step runtime by ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86660 2022-11-23T02:58:19.7378349Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86661 2022-11-23T02:58:19.7378714Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7378893Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7379279Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7379540Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7379913Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7380089Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7380479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7380671Z 
warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7380925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.7381156Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.7381564Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7381965Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7382205Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.7382440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.7382680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7382918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7383957Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7384074Z warnings.warn( 2022-11-23T02:58:19.7385093Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7385203Z warnings.warn( 2022-11-23T02:58:19.7385315Z File "", line 1, in 2022-11-23T02:58:19.7385444Z File "", line 1, in 2022-11-23T02:58:19.7385662Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7385805Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7386017Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7386162Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7386368Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7386522Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7386753Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7386909Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7387128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7387233Z self.run() 2022-11-23T02:58:19.7387443Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7387546Z self.run() 2022-11-23T02:58:19.7387752Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7387881Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7388083Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7388275Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7388628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7388766Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7389308Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7389449Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7389826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7389933Z getattr(self, test_name)() 2022-11-23T02:58:19.7390302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7390425Z getattr(self, test_name)() 2022-11-23T02:58:19.7390792Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7390896Z fn() 2022-11-23T02:58:19.7391258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7391352Z fn() 2022-11-23T02:58:19.7391724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7391830Z test(self, **param_kwargs) 2022-11-23T02:58:19.7392193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7392316Z test(self, **param_kwargs) 2022-11-23T02:58:19.7392670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7392795Z return func(*args, **kwargs) 2022-11-23T02:58:19.7393158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7393286Z return func(*args, **kwargs) 2022-11-23T02:58:19.7393544Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7393640Z self.run_subtests( 2022-11-23T02:58:19.7393894Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7394005Z self.run_subtests( 2022-11-23T02:58:19.7394362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7394524Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7394882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7395042Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7395412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7395554Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7395927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7396152Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7396550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7396670Z output = model(*input) 2022-11-23T02:58:19.7397040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7397162Z output = model(*input) 2022-11-23T02:58:19.7397493Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7397618Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7398018Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7398156Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7398546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in 
forward 2022-11-23T02:58:19.7398727Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7399111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7399290Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7399663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7399765Z _lazy_init(state, module) 2022-11-23T02:58:19.7400137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7400263Z _lazy_init(state, module) 2022-11-23T02:58:19.7400623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7400767Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7401125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7401268Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7401613Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7401719Z return func(*args, **kwargs) 2022-11-23T02:58:19.7402062Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7402189Z return func(*args, **kwargs) 2022-11-23T02:58:19.7402574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7402681Z p_assert( 2022-11-23T02:58:19.7403062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7403162Z p_assert( 2022-11-23T02:58:19.7403509Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7403618Z traceback.print_stack() 2022-11-23T02:58:19.7403961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7404086Z traceback.print_stack() 2022-11-23T02:58:19.7404327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7404568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7404699Z File "", line 1, in 2022-11-23T02:58:19.7404913Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7405061Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7405249Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7405402Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7405668Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7405779Z self.run() 2022-11-23T02:58:19.7405984Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7406134Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7406481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7406596Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7406965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7407136Z getattr(self, test_name)() 2022-11-23T02:58:19.7407505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7407603Z fn() 2022-11-23T02:58:19.7407976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:58:19.7408103Z test(self, **param_kwargs) 2022-11-23T02:58:19.7408466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7408573Z return func(*args, **kwargs) 2022-11-23T02:58:19.7408831Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7408944Z self.run_subtests( 2022-11-23T02:58:19.7409300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7409464Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7409840Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7409994Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7410379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7410481Z output = model(*input) 2022-11-23T02:58:19.7410813Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7410956Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7411336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7411517Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7411888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7412013Z _lazy_init(state, module) 2022-11-23T02:58:19.7412373Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7412498Z handle.init_flat_param_attributes() 
2022-11-23T02:58:19.7412845Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7412970Z return func(*args, **kwargs) 2022-11-23T02:58:19.7413351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7413452Z p_assert( 2022-11-23T02:58:19.7413793Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7413921Z traceback.print_stack() 2022-11-23T02:58:19.7414050Z File "", line 1, in 2022-11-23T02:58:19.7414245Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7414394Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7414596Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7414747Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7415010Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7415120Z self.run() 2022-11-23T02:58:19.7415323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7415451Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7415802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7415937Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7416302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7416475Z getattr(self, test_name)() 2022-11-23T02:58:19.7416845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7416943Z fn() 2022-11-23T02:58:19.7417317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:58:19.7417421Z test(self, **param_kwargs) 2022-11-23T02:58:19.7417784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7417908Z return func(*args, **kwargs) 2022-11-23T02:58:19.7418162Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7418272Z self.run_subtests( 2022-11-23T02:58:19.7418636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7418803Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7419176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7419312Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7419697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7419818Z output = model(*input) 2022-11-23T02:58:19.7420146Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7420285Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7420667Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7420845Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7421217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7421325Z _lazy_init(state, module) 2022-11-23T02:58:19.7421680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7421830Z handle.init_flat_param_attributes() 
2022-11-23T02:58:19.7422176Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7422301Z return func(*args, **kwargs) 2022-11-23T02:58:19.7422683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7422784Z p_assert( 2022-11-23T02:58:19.7423128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7423236Z traceback.print_stack() 2022-11-23T02:58:19.7423478Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7423722Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7423851Z File "", line 1, in 2022-11-23T02:58:19.7423980Z File "", line 1, in 2022-11-23T02:58:19.7424302Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7424454Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7424659Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7424793Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7425006Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7425149Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7425365Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7425467Z self.run() 2022-11-23T02:58:19.7425723Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7425876Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7426064Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7426215Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7426432Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7426534Z self.run() 2022-11-23T02:58:19.7426887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7427019Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7427225Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7427371Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7427718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7427846Z getattr(self, test_name)() 2022-11-23T02:58:19.7428189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7428319Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7428688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7428788Z fn() 2022-11-23T02:58:19.7429376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7429506Z getattr(self, test_name)() 2022-11-23T02:58:19.7429865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7429993Z test(self, **param_kwargs) 2022-11-23T02:58:19.7430356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7430459Z fn() 2022-11-23T02:58:19.7430813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7430939Z return func(*args, **kwargs) 2022-11-23T02:58:19.7431317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:58:19.7431441Z test(self, **param_kwargs) 2022-11-23T02:58:19.7431679Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7431794Z self.run_subtests( 2022-11-23T02:58:19.7432159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7432285Z return func(*args, **kwargs) 2022-11-23T02:58:19.7432642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7432809Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7433063Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7433178Z self.run_subtests( 2022-11-23T02:58:19.7433617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7433785Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7434145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7434310Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7434690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7434810Z output = model(*input) 2022-11-23T02:58:19.7435177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7435395Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7435710Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7435855Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7436240Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7436361Z output = model(*input) 2022-11-23T02:58:19.7436745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7436923Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7437254Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7437397Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7437753Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7437878Z _lazy_init(state, module) 2022-11-23T02:58:19.7438258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7438438Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7438797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7438942Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7439316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7439435Z _lazy_init(state, module) 2022-11-23T02:58:19.7439760Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7439886Z return func(*args, **kwargs) 2022-11-23T02:58:19.7440247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7440390Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7440778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.7440881Z p_assert( 2022-11-23T02:58:19.7441224Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7441349Z return func(*args, **kwargs) 2022-11-23T02:58:19.7441670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7441797Z traceback.print_stack() 2022-11-23T02:58:19.7442182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7442283Z p_assert( 2022-11-23T02:58:19.7442628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7442754Z traceback.print_stack() 2022-11-23T02:58:19.7442995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7443284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7443402Z File "", line 1, in 2022-11-23T02:58:19.7443620Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7443767Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7443974Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7444127Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7444344Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7444448Z self.run() 2022-11-23T02:58:19.7444683Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7444831Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7445182Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7445319Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7445686Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7445810Z getattr(self, test_name)() 2022-11-23T02:58:19.7446172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7446269Z fn() 2022-11-23T02:58:19.7446621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7446743Z test(self, **param_kwargs) 2022-11-23T02:58:19.7447104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7447234Z return func(*args, **kwargs) 2022-11-23T02:58:19.7447492Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7447607Z self.run_subtests( 2022-11-23T02:58:19.7447966Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7448130Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7448482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7448638Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7449016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7449136Z output = model(*input) 2022-11-23T02:58:19.7449469Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7449613Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7449996Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7450175Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7450579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7450701Z _lazy_init(state, module) 2022-11-23T02:58:19.7451059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7451202Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7451546Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7451675Z return func(*args, **kwargs) 2022-11-23T02:58:19.7451806Z File "", line 1, in 2022-11-23T02:58:19.7452193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7452277Z p_assert( 2022-11-23T02:58:19.7452671Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7452803Z traceback.print_stack() 2022-11-23T02:58:19.7453018Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7453160Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7453365Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7453517Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7453734Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7453866Z self.run() 2022-11-23T02:58:19.7454075Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7454220Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7454567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7454701Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7455067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7455192Z getattr(self, test_name)() 2022-11-23T02:58:19.7455536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7455635Z fn() 2022-11-23T02:58:19.7456005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7456128Z test(self, **param_kwargs) 2022-11-23T02:58:19.7456489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7456615Z return func(*args, **kwargs) 2022-11-23T02:58:19.7456874Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7456991Z 
self.run_subtests( 2022-11-23T02:58:19.7457330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7457494Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7457864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7458022Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7458404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7458527Z output = model(*input) 2022-11-23T02:58:19.7458855Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7458997Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7459366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7459544Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7459919Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7460040Z _lazy_init(state, module) 2022-11-23T02:58:19.7460394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7460538Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7460881Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7461009Z return func(*args, **kwargs) 2022-11-23T02:58:19.7461396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7461481Z p_assert( 2022-11-23T02:58:19.7461868Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7462002Z traceback.print_stack() 2022-11-23T02:58:19.7462244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7462485Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7462617Z File "", line 1, in 2022-11-23T02:58:19.7462834Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7462959Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7463161Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7463364Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7463581Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7463685Z self.run() 2022-11-23T02:58:19.7463895Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7464044Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7464396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7464512Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7464884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7465007Z getattr(self, test_name)() 2022-11-23T02:58:19.7465373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7465473Z fn() 2022-11-23T02:58:19.7465843Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7465967Z test(self, **param_kwargs) 2022-11-23T02:58:19.7466327Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7466434Z return func(*args, **kwargs) 2022-11-23T02:58:19.7466693Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7466807Z self.run_subtests( 2022-11-23T02:58:19.7467166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7467329Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7467696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7467854Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7468234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7468336Z output = model(*input) 2022-11-23T02:58:19.7468664Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7468805Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7469128Z File "", line 1, in 2022-11-23T02:58:19.7469534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7469714Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7470088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7470210Z _lazy_init(state, module) 2022-11-23T02:58:19.7470410Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7470552Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7470908Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.7471122Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7471340Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7471495Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7471842Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7471969Z return func(*args, **kwargs) 2022-11-23T02:58:19.7472169Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7472273Z self.run() 2022-11-23T02:58:19.7472660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7472824Z p_assert( 2022-11-23T02:58:19.7473030Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7473178Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7473528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7473636Z traceback.print_stack() 2022-11-23T02:58:19.7473979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7474112Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7474476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7474597Z getattr(self, test_name)() 2022-11-23T02:58:19.7474959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7475061Z fn() 2022-11-23T02:58:19.7475431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7475534Z test(self, **param_kwargs) 2022-11-23T02:58:19.7475896Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7476020Z return func(*args, **kwargs) 2022-11-23T02:58:19.7476280Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7476393Z self.run_subtests( 2022-11-23T02:58:19.7476750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7476915Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7477283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7477423Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7477803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7477924Z output = model(*input) 2022-11-23T02:58:19.7478259Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7478399Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7478780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7478956Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7479388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7479493Z _lazy_init(state, module) 2022-11-23T02:58:19.7479847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7479997Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7480339Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:19.7480510Z return func(*args, **kwargs) 2022-11-23T02:58:19.7480905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7481007Z p_assert( 2022-11-23T02:58:19.7481351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7481459Z traceback.print_stack() 2022-11-23T02:58:19.7481699Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7481939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7482070Z File "", line 1, in 2022-11-23T02:58:19.7482249Z File "", line 1, in 2022-11-23T02:58:19.7482468Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7482612Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7482821Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7482955Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7483164Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7483309Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7483526Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7483630Z self.run() 2022-11-23T02:58:19.7483834Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7483990Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7484175Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7484326Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7484540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7484642Z self.run() 2022-11-23T02:58:19.7484995Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7485128Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7485332Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7485478Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7485825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7485949Z getattr(self, test_name)() 2022-11-23T02:58:19.7486288Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7486422Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7486789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7486887Z fn() 2022-11-23T02:58:19.7487250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7487373Z getattr(self, test_name)() 2022-11-23T02:58:19.7487726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7487849Z test(self, **param_kwargs) 2022-11-23T02:58:19.7488212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7488309Z fn() 2022-11-23T02:58:19.7488668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7488798Z return func(*args, **kwargs) 2022-11-23T02:58:19.7489171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7489274Z test(self, **param_kwargs) 2022-11-23T02:58:19.7489578Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7489697Z self.run_subtests( 2022-11-23T02:58:19.7490062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7490187Z return func(*args, **kwargs) 2022-11-23T02:58:19.7490550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7490714Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7490969Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7491108Z self.run_subtests( 2022-11-23T02:58:19.7491479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7491634Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7491993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7492156Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7492534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7492655Z output = model(*input) 2022-11-23T02:58:19.7493022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7493175Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7493487Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7493635Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7494013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:58:19.7494135Z output = model(*input) 2022-11-23T02:58:19.7494521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7494700Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7495033Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7495175Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7495534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7495656Z _lazy_init(state, module) 2022-11-23T02:58:19.7496039Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7496218Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7496578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7496724Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7497095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7497217Z _lazy_init(state, module) 2022-11-23T02:58:19.7497540Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7497666Z return func(*args, **kwargs) 2022-11-23T02:58:19.7498021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7498164Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7498557Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7498659Z p_assert( 2022-11-23T02:58:19.7499061Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7499193Z return func(*args, **kwargs) 2022-11-23T02:58:19.7499520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7499646Z traceback.print_stack() 2022-11-23T02:58:19.7500027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7500129Z p_assert( 2022-11-23T02:58:19.7500468Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7500596Z traceback.print_stack() 2022-11-23T02:58:19.7500887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7501127Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7501239Z File "", line 1, in 2022-11-23T02:58:19.7501458Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7501602Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7501807Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7501959Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7502180Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7502283Z self.run() 2022-11-23T02:58:19.7502471Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7502619Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7502970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7503105Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7503472Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7503602Z getattr(self, test_name)() 2022-11-23T02:58:19.7503967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7504065Z fn() 2022-11-23T02:58:19.7504419Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7504543Z test(self, **param_kwargs) 2022-11-23T02:58:19.7504904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7505030Z return func(*args, **kwargs) 2022-11-23T02:58:19.7505284Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7505402Z self.run_subtests( 2022-11-23T02:58:19.7505761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7505928Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7506041Z File "", line 1, in 2022-11-23T02:58:19.7506412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7506567Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7506947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7507065Z output = model(*input) 2022-11-23T02:58:19.7507277Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7507424Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7507757Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7507880Z return 
forward_call(*input, **kwargs) 2022-11-23T02:58:19.7508131Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7508290Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7508674Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7508850Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7509363Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7509474Z self.run() 2022-11-23T02:58:19.7509836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7510039Z _lazy_init(state, module) 2022-11-23T02:58:19.7510251Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7510397Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7510764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7510907Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7511249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7511382Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7511707Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7511833Z return func(*args, **kwargs) 2022-11-23T02:58:19.7512202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7512326Z getattr(self, test_name)() 2022-11-23T02:58:19.7512717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7512817Z p_assert( 
2022-11-23T02:58:19.7513183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7513281Z fn() 2022-11-23T02:58:19.7513605Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7513732Z traceback.print_stack() 2022-11-23T02:58:19.7514102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7514226Z test(self, **param_kwargs) 2022-11-23T02:58:19.7514589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7514714Z return func(*args, **kwargs) 2022-11-23T02:58:19.7514974Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7515088Z self.run_subtests( 2022-11-23T02:58:19.7515427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7515596Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7515964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7516119Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7516499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7516618Z output = model(*input) 2022-11-23T02:58:19.7516946Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7517088Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7517456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7517635Z args, 
kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7518069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7518200Z _lazy_init(state, module) 2022-11-23T02:58:19.7518564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7518707Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7519051Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7519175Z return func(*args, **kwargs) 2022-11-23T02:58:19.7519541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7519693Z p_assert( 2022-11-23T02:58:19.7520036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7520163Z traceback.print_stack() 2022-11-23T02:58:19.7520409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7520648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7520779Z File "", line 1, in 2022-11-23T02:58:19.7520907Z File "", line 1, in 2022-11-23T02:58:19.7521105Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7521280Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7521486Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7521638Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7521852Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7521994Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7522215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7522301Z self.run() 2022-11-23T02:58:19.7522511Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7522662Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7522866Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7523012Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7523227Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7523330Z self.run() 2022-11-23T02:58:19.7523679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7523793Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7524002Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7524149Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7524520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7524641Z getattr(self, test_name)() 2022-11-23T02:58:19.7524982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in 
_run 2022-11-23T02:58:19.7525113Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7525478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7525558Z fn() 2022-11-23T02:58:19.7525925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7526046Z getattr(self, test_name)() 2022-11-23T02:58:19.7526421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7526541Z test(self, **param_kwargs) 2022-11-23T02:58:19.7526952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7527055Z fn() 2022-11-23T02:58:19.7527420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7527529Z return func(*args, **kwargs) 2022-11-23T02:58:19.7527899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7528020Z test(self, **param_kwargs) 2022-11-23T02:58:19.7528279Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7528393Z self.run_subtests( 2022-11-23T02:58:19.7528753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7528926Z return func(*args, **kwargs) 2022-11-23T02:58:19.7529269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7529438Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7529695Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 
2022-11-23T02:58:19.7529808Z self.run_subtests( 2022-11-23T02:58:19.7530176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7530331Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7530685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7530851Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7531233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7531335Z output = model(*input) 2022-11-23T02:58:19.7531708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7531861Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7532192Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7532332Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7532709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7532828Z output = model(*input) 2022-11-23T02:58:19.7533211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7533376Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7533707Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7533849Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7534222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7534342Z _lazy_init(state, module) 2022-11-23T02:58:19.7534721Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7534899Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7535257Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7535383Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7535755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7535879Z _lazy_init(state, module) 2022-11-23T02:58:19.7536223Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7536392Z return func(*args, **kwargs) 2022-11-23T02:58:19.7536760Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7536903Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7537287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7537371Z p_assert( 2022-11-23T02:58:19.7537714Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7537838Z return func(*args, **kwargs) 2022-11-23T02:58:19.7538178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7538353Z traceback.print_stack() 2022-11-23T02:58:19.7538737Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7538841Z p_assert( 2022-11-23T02:58:19.7539182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7539292Z traceback.print_stack() 2022-11-23T02:58:19.7539535Z 
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7539777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7539907Z File "", line 1, in 2022-11-23T02:58:19.7540120Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7540264Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7540473Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7540606Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7540823Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7540930Z self.run() 2022-11-23T02:58:19.7541133Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7541277Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7541629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7541762Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7542127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7542231Z getattr(self, test_name)() 2022-11-23T02:58:19.7542361Z File "", line 1, in 2022-11-23T02:58:19.7542730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7542827Z fn() 2022-11-23T02:58:19.7543196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7543323Z test(self, **param_kwargs) 2022-11-23T02:58:19.7543533Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7543674Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7544019Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7544145Z return func(*args, **kwargs) 2022-11-23T02:58:19.7544349Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7544500Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7544752Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7544867Z self.run_subtests( 2022-11-23T02:58:19.7545086Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7545170Z self.run() 2022-11-23T02:58:19.7545577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7545747Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7545956Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7546101Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7546476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7546632Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7546973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7547137Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7547524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7547644Z output = model(*input) 2022-11-23T02:58:19.7548013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7548137Z getattr(self, test_name)() 2022-11-23T02:58:19.7548466Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7548607Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7549184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7549271Z fn() 2022-11-23T02:58:19.7549659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7549846Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7550217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7550375Z test(self, **param_kwargs) 2022-11-23T02:58:19.7550748Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7550871Z _lazy_init(state, module) 2022-11-23T02:58:19.7551232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7551339Z return func(*args, **kwargs) 2022-11-23T02:58:19.7551696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7551840Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7552095Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7552212Z self.run_subtests( 2022-11-23T02:58:19.7552558Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7552687Z return func(*args, **kwargs) 2022-11-23T02:58:19.7553046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7553191Z test_fn(*test_args, **test_kwargs, 
**subtest_kwargs) 2022-11-23T02:58:19.7553579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7553682Z p_assert( 2022-11-23T02:58:19.7554052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7554203Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7554550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7554676Z traceback.print_stack() 2022-11-23T02:58:19.7555058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7555234Z output = model(*input) 2022-11-23T02:58:19.7555580Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7555722Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7556104Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7556282Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7556653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7556774Z _lazy_init(state, module) 2022-11-23T02:58:19.7557198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7557325Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7557673Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7557798Z return func(*args, **kwargs) 2022-11-23T02:58:19.7558187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.7558289Z p_assert( 2022-11-23T02:58:19.7558630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7558760Z traceback.print_stack() 2022-11-23T02:58:19.7559001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7559225Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7559361Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7559575Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7559717Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7559924Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7560076Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7560291Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7560394Z self.run() 2022-11-23T02:58:19.7560579Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7560727Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7561077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7561212Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7561582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7561707Z getattr(self, test_name)() 2022-11-23T02:58:19.7562071Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7562172Z fn() 2022-11-23T02:58:19.7562524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7562651Z test(self, **param_kwargs) 
2022-11-23T02:58:19.7563014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7563139Z return func(*args, **kwargs) 2022-11-23T02:58:19.7563398Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7563514Z self.run_subtests( 2022-11-23T02:58:19.7563872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7564021Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7564464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7564628Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7565011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7565133Z output = model(*input) 2022-11-23T02:58:19.7565465Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7565609Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7565991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7566167Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7566633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7566757Z _lazy_init(state, module) 2022-11-23T02:58:19.7567120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7567264Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7567608Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:58:19.7567734Z return func(*args, **kwargs) 2022-11-23T02:58:19.7568119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7568221Z p_assert( 2022-11-23T02:58:19.7568544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7568675Z traceback.print_stack() 2022-11-23T02:58:19.7568805Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7569023Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7569166Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7569375Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7569531Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7569730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7569833Z self.run() 2022-11-23T02:58:19.7570033Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7570179Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7570525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7570657Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7571030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7571155Z getattr(self, test_name)() 2022-11-23T02:58:19.7571503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7571601Z fn() 2022-11-23T02:58:19.7571973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7572096Z test(self, **param_kwargs) 
2022-11-23T02:58:19.7572459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7572584Z return func(*args, **kwargs) 2022-11-23T02:58:19.7572840Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7572955Z self.run_subtests( 2022-11-23T02:58:19.7573301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7573463Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7573887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7574047Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7574433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7574553Z output = model(*input) 2022-11-23T02:58:19.7574884Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7575026Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7575392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7575617Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7575991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7576114Z _lazy_init(state, module) 2022-11-23T02:58:19.7576473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7576617Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7576961Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:58:19.7577083Z return func(*args, **kwargs) 2022-11-23T02:58:19.7577448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7577549Z p_assert( 2022-11-23T02:58:19.7577892Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7578022Z traceback.print_stack() 2022-11-23T02:58:19.7578265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7578499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7578629Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7578846Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7578972Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7579177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7579329Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7579546Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7579650Z self.run() 2022-11-23T02:58:19.7579776Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7579982Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7580113Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7580459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7580591Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7580805Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7580948Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7581319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in 
run_test 2022-11-23T02:58:19.7581441Z getattr(self, test_name)() 2022-11-23T02:58:19.7581647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7581781Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7582148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7582249Z fn() 2022-11-23T02:58:19.7582459Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7582560Z self.run() 2022-11-23T02:58:19.7582982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7583113Z test(self, **param_kwargs) 2022-11-23T02:58:19.7583300Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7583447Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7583816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7583942Z return func(*args, **kwargs) 2022-11-23T02:58:19.7584285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7584418Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7584723Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7584836Z self.run_subtests( 2022-11-23T02:58:19.7585192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7585317Z getattr(self, test_name)() 2022-11-23T02:58:19.7585676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7585839Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7586201Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7586300Z fn() 2022-11-23T02:58:19.7586667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7586819Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7587178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7587301Z test(self, **param_kwargs) 2022-11-23T02:58:19.7587685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7587803Z output = model(*input) 2022-11-23T02:58:19.7588166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7588290Z return func(*args, **kwargs) 2022-11-23T02:58:19.7588620Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7588760Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7589222Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7589345Z self.run_subtests( 2022-11-23T02:58:19.7589743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7589924Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7590284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7590449Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7590825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.7590948Z _lazy_init(state, module) 2022-11-23T02:58:19.7591298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7591451Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7591809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7591960Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7592340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7592462Z output = model(*input) 2022-11-23T02:58:19.7592878Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7593015Z return func(*args, **kwargs) 2022-11-23T02:58:19.7593331Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7593472Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7593854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7593956Z p_assert( 2022-11-23T02:58:19.7594338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7594576Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7594918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7595049Z traceback.print_stack() 2022-11-23T02:58:19.7595422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7595526Z _lazy_init(state, module) 2022-11-23T02:58:19.7595884Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7596029Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7596377Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7596503Z return func(*args, **kwargs) 2022-11-23T02:58:19.7596888Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7596994Z p_assert( 2022-11-23T02:58:19.7597321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7597452Z traceback.print_stack() 2022-11-23T02:58:19.7597694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7597932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7598061Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7598275Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7598423Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7598629Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7598763Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7598897Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7599114Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7599218Z self.run() 2022-11-23T02:58:19.7599424Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7599574Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7599787Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7599929Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7600261Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7600396Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7600600Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7600753Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7601121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7601249Z getattr(self, test_name)() 2022-11-23T02:58:19.7601464Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7601548Z self.run() 2022-11-23T02:58:19.7601963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7602068Z fn() 2022-11-23T02:58:19.7602272Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7602418Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7602791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7602914Z test(self, **param_kwargs) 2022-11-23T02:58:19.7603258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7603418Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7603788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7603916Z return func(*args, **kwargs) 2022-11-23T02:58:19.7604281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7604405Z getattr(self, test_name)() 2022-11-23T02:58:19.7604659Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 
2022-11-23T02:58:19.7604773Z self.run_subtests( 2022-11-23T02:58:19.7605140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7605220Z fn() 2022-11-23T02:58:19.7605578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7605745Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7606117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7606240Z test(self, **param_kwargs) 2022-11-23T02:58:19.7606606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7606761Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7607124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7607231Z return func(*args, **kwargs) 2022-11-23T02:58:19.7607613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7607734Z output = model(*input) 2022-11-23T02:58:19.7607991Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 214, in test_delayed_reduce_scatter 2022-11-23T02:58:19.7608109Z self.run_subtests( 2022-11-23T02:58:19.7608444Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7608588Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7608945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7609091Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7609477Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7609660Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7610030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7610184Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7610557Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7610680Z _lazy_init(state, module) 2022-11-23T02:58:19.7611106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7611213Z output = model(*input) 2022-11-23T02:58:19.7611574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7611719Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7612049Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7612189Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7612535Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7612659Z return func(*args, **kwargs) 2022-11-23T02:58:19.7613096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7613256Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7613644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7613748Z p_assert( 2022-11-23T02:58:19.7614124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:19.7614248Z _lazy_init(state, module) 2022-11-23T02:58:19.7614587Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7614714Z traceback.print_stack() 2022-11-23T02:58:19.7615071Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7615197Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7615545Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7615670Z return func(*args, **kwargs) 2022-11-23T02:58:19.7616059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7616161Z p_assert( 2022-11-23T02:58:19.7616503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7616630Z traceback.print_stack() 2022-11-23T02:58:19.7616871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7617094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7617332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7617568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7617802Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7618036Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7618270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7618501Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7618732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7618943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7619179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7619413Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7620492Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7620739Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:58:19.7621755Z /opt/conda/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7622033Z Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2022-11-23T02:58:19.7622270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7622512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7622748Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7622983Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7623214Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7623428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7623659Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7623891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7624213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7624480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7624751Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7625020Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7625267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7625700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7625918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7626189Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7626462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7626733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7627046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7627320Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7627586Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7627852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7628066Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7628374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7629436Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7630340Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7630622Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7630891Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7631159Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7631428Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7631768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7632080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7632295Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.7632563Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7632831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7633149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7633414Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7633679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.7633838Z dist init r=0, world=2 2022-11-23T02:58:19.7634213Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7634613Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7634939Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7635298Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7635647Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7636016Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7636376Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.7636738Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7637105Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7637463Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7637847Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7638212Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.7638410Z dist init r=1, world=2 2022-11-23T02:58:19.7638760Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7639221Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7639577Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7639923Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7640316Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.7640682Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7641078Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7641441Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7641802Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7642153Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7642502Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7642860Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.7642946Z ok (5.914s) 2022-11-23T02:58:19.7643196Z test_delayed_reduce_scatter_offload_true_none (__main__.TestParityWithDDP) 2022-11-23T02:58:19.7644168Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82399 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. 
(0.001s) 2022-11-23T02:58:19.7644542Z test_delayed_reduce_scatter_offload_true_shard_grad_op (__main__.TestParityWithDDP) 2022-11-23T02:58:19.7645487Z Tests the FSDP forward, backward, and optimizer step runtime by ... skip: Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/82403 for platform(s) linux, rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests. (0.001s) 2022-11-23T02:58:19.7645886Z test_mixture_of_experts_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 86743 2022-11-23T02:58:19.7646147Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 86744 2022-11-23T02:58:19.7646585Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7646806Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7647234Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7647527Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7647896Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7648153Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7648575Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7648805Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7649090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.7649377Z 
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.7649890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7650420Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7650684Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.7650902Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.7652009Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7652156Z warnings.warn( 2022-11-23T02:58:19.7652426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.7653499Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.7653656Z warnings.warn( 2022-11-23T02:58:19.7653939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.7654382Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.7654824Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.7655147Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.7655428Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.7655814Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.7656250Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.7656539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.7656818Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.7657258Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7657720Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7658557Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7659376Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7659698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.7659967Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.7660408Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.7660850Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.7661080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.7661356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.7661796Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7662232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7662562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.7662844Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.7663287Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:58:19.7663726Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.7664057Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.7664282Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.7664714Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7665151Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7666240Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.7666521Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.7667601Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T02:58:19.7667842Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.7668182Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.7668474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.7668915Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.7669621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.7669908Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.7670218Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.7670714Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.7671161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.7671443Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.7671765Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.7672214Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.7672646Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:58:19.7672933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.7673212Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.7673607Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.7674084Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.7674882Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7675164Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.7675453Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.7675892Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.7676330Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.7676609Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.7676881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.7677356Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 
2022-11-23T02:58:19.7678151Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7679065Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7679859Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7680248Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.7681026Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7681848Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7682644Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7682931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.7683259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.7683709Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.7684146Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.7684426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.7684703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.7685142Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.7685574Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.7686387Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7686620Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.7686933Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.7687430Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.7687863Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.7688144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.7688425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.7688910Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.7689365Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.7689643Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.7689952Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.7690342Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.7690773Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:58:19.7691111Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.7691388Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.7691834Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.7692268Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.7693065Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7693395Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.7693710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.7694158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.7694542Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.7694829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.7695114Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.7695549Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 
2022-11-23T02:58:19.7695983Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.7696268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.7696541Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.7697015Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.7697448Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.7697676Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.7697962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.7698395Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.7698829Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.7699668Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7700007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.7700283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.7700756Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.7701203Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.7701402Z dist init r=0, world=2 2022-11-23T02:58:19.7701496Z dist init r=1, world=2 2022-11-23T02:58:19.7701631Z ok (6.916s) 2022-11-23T02:58:19.7702027Z test_mixture_of_experts_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87066 2022-11-23T02:58:19.7702287Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87067 2022-11-23T02:58:19.7702703Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7702928Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7703390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7703620Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7704035Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7704195Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7704615Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7704842Z warnings.warn(f"loaded {len(disabled_tests_dict)} 
disabled tests") 2022-11-23T02:58:19.7705176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.7705469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.7705913Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7706380Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7706657Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.7706872Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.7707951Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7708102Z warnings.warn( 2022-11-23T02:58:19.7708380Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.7709771Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.7709943Z warnings.warn( 2022-11-23T02:58:19.7710230Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.7710727Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.7711165Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.7711449Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.7711728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.7712302Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.7712691Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.7712972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.7713249Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.7713685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7714154Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7714955Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7715757Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7716044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.7716324Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.7716757Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.7717196Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.7717427Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.7717710Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.7718180Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7718628Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7718912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.7719240Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.7719673Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:58:19.7720111Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.7720956Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7721243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.7721470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.7721955Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7722444Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7723244Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7723527Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.7723965Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.7724242Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.7724689Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:58:19.7724977Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.7725290Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.7725737Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.7726123Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.7726960Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7727243Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.7727528Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.7727977Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.7728419Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.7729219Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7729534Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.7729814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.7730264Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.7730758Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.7730994Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.7731282Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.7731722Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.7732154Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.7732942Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7733366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.7733645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.7734104Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 
2022-11-23T02:58:19.7734538Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.7735328Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7735611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.7735836Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.7736273Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.7737059Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7737531Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.7737830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.7738108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.7738554Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.7738990Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:58:19.7739773Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7740054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.7740335Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.7740719Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.7741302Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.7741590Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.7741869Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.7742310Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.7742738Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.7743583Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7743878Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.7744152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.7744630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.7745065Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.7745298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.7745578Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.7746017Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.7746451Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.7746742Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.7747062Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.7747499Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.7747965Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:58:19.7748250Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.7748475Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.7748918Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.7749607Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.7750447Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7751247Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7751611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.7751902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.7752392Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.7752833Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:58:19.7753112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.7753398Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.7753868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.7754307Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.7754646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.7754923Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.7755360Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.7755828Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.7755989Z dist init r=1, world=2 2022-11-23T02:58:19.7756133Z dist init r=0, world=2 2022-11-23T02:58:19.7756221Z ok (7.116s) 2022-11-23T02:58:19.7756620Z test_mixture_of_experts_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87389 2022-11-23T02:58:19.7756885Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87390 2022-11-23T02:58:19.7757302Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7757517Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7757944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7758220Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7758633Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7758845Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7759218Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7759446Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7759783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.7760071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.7760511Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7760959Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.7761273Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.7761542Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.7762676Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7762833Z warnings.warn( 2022-11-23T02:58:19.7763908Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7764124Z warnings.warn( 2022-11-23T02:58:19.7764356Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.7764641Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.7765086Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.7765558Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:58:19.7765840Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.7766118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.7766606Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.7767056Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.7767341Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.7767567Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.7768002Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7768436Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7769270Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7770076Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7770361Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.7770653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.7771096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.7771529Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.7771820Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.7772103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.7772536Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7773019Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7773298Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.7773635Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.7774073Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.7774502Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.7775355Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7775636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.7775916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.7776395Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7776780Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7777580Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7777864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.7778140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.7778572Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.7779003Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.7779348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.7779645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.7780127Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:58:19.7780615Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.7781406Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7781639Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.7781916Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.7782360Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.7782792Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.7783640Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7783932Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.7784244Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.7784685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.7785181Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:58:19.7785468Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.7785760Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.7786146Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.7786576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.7787362Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7787693Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.7788005Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.7788445Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.7788881Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.7789929Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7790215Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.7790497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.7790936Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.7791323Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.7792116Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7792441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.7792722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.7793175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.7793685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.7794492Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.7794776Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.7795106Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.7795610Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.7796091Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.7796325Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.7796600Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.7797035Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.7797466Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.7798255Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7798547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.7798833Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.7799277Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:58:19.7799749Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.7800034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.7800257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.7800706Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.7801138Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.7801420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.7801745Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.7802191Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.7802626Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.7802939Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.7803221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.7803662Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.7804090Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.7804900Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7805678Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.7806018Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.7806291Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.7806732Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.7807200Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.7807483Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.7807759Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.7808204Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.7808690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 
2022-11-23T02:58:19.7808917Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.7809195Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.7809633Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.7810065Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.7810246Z dist init r=0, world=2 2022-11-23T02:58:19.7810391Z dist init r=1, world=2 2022-11-23T02:58:19.7810536Z ok (6.816s) 2022-11-23T02:58:19.7810926Z test_mixture_of_experts_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 87712 2022-11-23T02:58:19.7811140Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 87713 2022-11-23T02:58:19.7811563Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7859267Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7859779Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7859963Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.7860341Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.7860511Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.7860886Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.7861072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 
2022-11-23T02:58:19.7861312Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.7861703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.7862129Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7862521Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.7862738Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.7862960Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.7863989Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.7864165Z warnings.warn( 2022-11-23T02:58:19.7864403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.7865414Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.7865517Z warnings.warn( 2022-11-23T02:58:19.7865751Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.7866145Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.7866266Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7866473Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7866600Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7866797Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7866937Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7867143Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7867235Z self.run() 2022-11-23T02:58:19.7867431Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7867571Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7867911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7868029Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7868390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7868503Z getattr(self, test_name)() 2022-11-23T02:58:19.7868861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7869227Z fn() 2022-11-23T02:58:19.7869609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7869723Z test(self, **param_kwargs) 2022-11-23T02:58:19.7870064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper
2022-11-23T02:58:19.7870185Z return func(*args, **kwargs) 2022-11-23T02:58:19.7870427Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7870531Z self.run_subtests( 2022-11-23T02:58:19.7870961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7871124Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7871482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7871625Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7871992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7872095Z output = model(*input) 2022-11-23T02:58:19.7872415Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7872612Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7872990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7873161Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7873524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7873634Z _lazy_init(state, module) 2022-11-23T02:58:19.7873978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7874105Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7874438Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7874551Z return func(*args, **kwargs) 2022-11-23T02:58:19.7874931Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7875022Z p_assert( 2022-11-23T02:58:19.7875353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7875470Z traceback.print_stack() 2022-11-23T02:58:19.7875866Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.7875979Z File "", line 1, in 2022-11-23T02:58:19.7876180Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7876311Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7876504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7876644Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7876847Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7876943Z self.run() 2022-11-23T02:58:19.7877130Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7877265Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7877601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7877723Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7878078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7878190Z getattr(self, test_name)() 2022-11-23T02:58:19.7878543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7878630Z fn() 2022-11-23T02:58:19.7878985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7879101Z test(self, 
**param_kwargs) 2022-11-23T02:58:19.7879454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7879567Z return func(*args, **kwargs) 2022-11-23T02:58:19.7879857Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7879965Z self.run_subtests( 2022-11-23T02:58:19.7880313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7880464Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7880816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7880960Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7881329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7881488Z output = model(*input) 2022-11-23T02:58:19.7881808Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7881939Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7882312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7882481Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7882837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7882947Z _lazy_init(state, module) 2022-11-23T02:58:19.7883293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7883426Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7883764Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7883877Z return func(*args, **kwargs) 2022-11-23T02:58:19.7884250Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7884344Z p_assert( 2022-11-23T02:58:19.7884669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7884785Z traceback.print_stack() 2022-11-23T02:58:19.7885022Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.7885259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.7885652Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.7885770Z File "", line 1, in 2022-11-23T02:58:19.7885979Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7886112Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7886301Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7886444Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7886649Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7886741Z self.run() 2022-11-23T02:58:19.7886935Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7887070Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7887404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7887524Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7887873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:58:19.7887989Z getattr(self, test_name)() 2022-11-23T02:58:19.7888343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7888429Z fn() 2022-11-23T02:58:19.7888837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7888955Z test(self, **param_kwargs) 2022-11-23T02:58:19.7889307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7889420Z return func(*args, **kwargs) 2022-11-23T02:58:19.7889655Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7889757Z self.run_subtests( 2022-11-23T02:58:19.7890101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7890333Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7890690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7890836Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7891202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7891311Z output = model(*input) 2022-11-23T02:58:19.7891625Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7891755Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7892126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7892292Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7892658Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7892768Z _lazy_init(state, module) 2022-11-23T02:58:19.7893111Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7893246Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7893571Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7893686Z return func(*args, **kwargs) 2022-11-23T02:58:19.7894057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7894148Z p_assert( 2022-11-23T02:58:19.7894477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7894592Z traceback.print_stack() 2022-11-23T02:58:19.7894986Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:58:19.7895107Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7895303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7895436Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7895630Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7895770Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7895974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7896066Z self.run() 2022-11-23T02:58:19.7896260Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7896389Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7896725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7896851Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7897207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7897319Z getattr(self, test_name)() 2022-11-23T02:58:19.7897720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7897815Z fn() 2022-11-23T02:58:19.7898177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7898281Z test(self, **param_kwargs) 2022-11-23T02:58:19.7898629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7898743Z return func(*args, **kwargs) 2022-11-23T02:58:19.7898986Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7899138Z self.run_subtests( 2022-11-23T02:58:19.7899487Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7899638Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7899998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7900135Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7900504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7900613Z output = model(*input) 2022-11-23T02:58:19.7900931Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7901061Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7901431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7901603Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7901968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7902075Z _lazy_init(state, module) 2022-11-23T02:58:19.7902422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7902554Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7902886Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7902999Z return func(*args, **kwargs) 2022-11-23T02:58:19.7903371Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7903462Z p_assert( 2022-11-23T02:58:19.7903792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7903904Z traceback.print_stack() 2022-11-23T02:58:19.7904143Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.7904381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.7904777Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7905167Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.7905287Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7905489Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7905621Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7905810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7905956Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7906161Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7906252Z self.run() 2022-11-23T02:58:19.7906494Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7906636Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7906976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7907097Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7907446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7907558Z getattr(self, test_name)() 2022-11-23T02:58:19.7907910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7908104Z fn() 2022-11-23T02:58:19.7908465Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7908580Z test(self, **param_kwargs) 2022-11-23T02:58:19.7909162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7909296Z return func(*args, **kwargs) 2022-11-23T02:58:19.7909532Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7909636Z self.run_subtests( 2022-11-23T02:58:19.7909987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7910138Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7910494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7910640Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7911008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7911117Z output = model(*input) 2022-11-23T02:58:19.7911431Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7911562Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7911934Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7912100Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7912460Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7912569Z _lazy_init(state, module) 2022-11-23T02:58:19.7912915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.7913050Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7913376Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7913489Z return func(*args, **kwargs) 2022-11-23T02:58:19.7913864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7913955Z p_assert( 2022-11-23T02:58:19.7914285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7914400Z traceback.print_stack() 2022-11-23T02:58:19.7914518Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.7914719Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7914844Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7915037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7915180Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7915382Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7915473Z self.run() 2022-11-23T02:58:19.7915745Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7915890Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7916220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7916342Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7916695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7916806Z getattr(self, test_name)() 2022-11-23T02:58:19.7917157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7917307Z fn() 2022-11-23T02:58:19.7917670Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7917782Z test(self, **param_kwargs) 2022-11-23T02:58:19.7918132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7918246Z return func(*args, **kwargs) 2022-11-23T02:58:19.7918483Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7918584Z self.run_subtests( 2022-11-23T02:58:19.7918931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7919081Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7919434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7919579Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7919943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7920051Z output = model(*input) 2022-11-23T02:58:19.7920371Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7920501Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7920870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7921034Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7921394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7921503Z _lazy_init(state, module) 2022-11-23T02:58:19.7921840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.7921977Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7922308Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7922426Z return func(*args, **kwargs) 2022-11-23T02:58:19.7922801Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7922893Z p_assert( 2022-11-23T02:58:19.7923221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7923337Z traceback.print_stack() 2022-11-23T02:58:19.7923570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.7923805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.7924201Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-11-23T02:58:19.7924324Z File "", line 1, in 2022-11-23T02:58:19.7924528Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7924709Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7924909Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7925051Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7925250Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7925342Z self.run() 2022-11-23T02:58:19.7925538Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7925674Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7926013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7926184Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7926539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7926650Z getattr(self, test_name)() 2022-11-23T02:58:19.7926998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7927085Z fn() 2022-11-23T02:58:19.7927441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7927553Z test(self, **param_kwargs) 2022-11-23T02:58:19.7927900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7928014Z return func(*args, **kwargs) 2022-11-23T02:58:19.7928252Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7928356Z self.run_subtests( 2022-11-23T02:58:19.7928697Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7928847Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7929204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7929348Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7929715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7929824Z output = model(*input) 2022-11-23T02:58:19.7930141Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7930271Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7930637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7930808Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7931165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7931276Z _lazy_init(state, module) 2022-11-23T02:58:19.7931621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7931753Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7932081Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7932193Z return func(*args, **kwargs) 2022-11-23T02:58:19.7932558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7932649Z p_assert( 2022-11-23T02:58:19.7932980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7933099Z traceback.print_stack() 2022-11-23T02:58:19.7933493Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.7933658Z File "", line 1, in 2022-11-23T02:58:19.7933869Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7934000Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7934188Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7934328Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7934534Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7934625Z self.run() 2022-11-23T02:58:19.7934819Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7934956Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7935344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7935458Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7935820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7935931Z getattr(self, test_name)() 2022-11-23T02:58:19.7936284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7936371Z fn() 2022-11-23T02:58:19.7936731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7936843Z test(self, **param_kwargs) 2022-11-23T02:58:19.7937193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7937299Z return func(*args, **kwargs) 2022-11-23T02:58:19.7937540Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7937642Z self.run_subtests( 2022-11-23T02:58:19.7937992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7938144Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7938498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7938641Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7939010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7939111Z output = model(*input) 2022-11-23T02:58:19.7939429Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7939561Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7939931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7940097Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7940457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7940566Z _lazy_init(state, module) 2022-11-23T02:58:19.7940910Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7941036Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7941368Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7941481Z return func(*args, **kwargs) 2022-11-23T02:58:19.7941853Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.7941947Z p_assert( 2022-11-23T02:58:19.7942276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7942392Z traceback.print_stack() 2022-11-23T02:58:19.7942679Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.7942918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.7943316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7943435Z File "", line 1, in 2022-11-23T02:58:19.7943639Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7943772Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7943967Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7944153Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7944359Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7944445Z self.run() 2022-11-23T02:58:19.7944642Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7944779Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7945112Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7945234Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7945587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7945700Z getattr(self, test_name)() 2022-11-23T02:58:19.7946054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7946138Z fn() 2022-11-23T02:58:19.7946500Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7946611Z test(self, **param_kwargs) 2022-11-23T02:58:19.7946963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7947077Z return func(*args, **kwargs) 2022-11-23T02:58:19.7947318Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7947422Z self.run_subtests( 2022-11-23T02:58:19.7947769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7947916Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7948269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7948416Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7948786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7948896Z output = model(*input) 2022-11-23T02:58:19.7949459Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7949593Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7949976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7950138Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7950546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7950671Z _lazy_init(state, module) 2022-11-23T02:58:19.7951032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.7951180Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7951522Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7951721Z return func(*args, **kwargs) 2022-11-23T02:58:19.7952118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7952203Z p_assert( 2022-11-23T02:58:19.7952543Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7952670Z traceback.print_stack() 2022-11-23T02:58:19.7953075Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.7953206Z File "", line 1, in 2022-11-23T02:58:19.7953420Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7953644Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7953850Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7953984Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7954206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7954310Z self.run() 2022-11-23T02:58:19.7954516Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7954662Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7955012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7955145Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7955492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7955618Z getattr(self, test_name)() 2022-11-23T02:58:19.7955988Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7956085Z fn() 2022-11-23T02:58:19.7956459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7956580Z test(self, **param_kwargs) 2022-11-23T02:58:19.7956940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7957066Z return func(*args, **kwargs) 2022-11-23T02:58:19.7957300Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7957415Z self.run_subtests( 2022-11-23T02:58:19.7957775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7957936Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7958305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7958459Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7958839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7958958Z output = model(*input) 2022-11-23T02:58:19.7959274Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7959415Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7959798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7959973Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7960345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.7960469Z _lazy_init(state, module) 2022-11-23T02:58:19.7960825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7960968Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7961342Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7961475Z return func(*args, **kwargs) 2022-11-23T02:58:19.7961865Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7961968Z p_assert( 2022-11-23T02:58:19.7962307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7962436Z traceback.print_stack() 2022-11-23T02:58:19.7962687Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.7962983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.7963373Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:58:19.7963507Z File "", line 1, in 2022-11-23T02:58:19.7963720Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7963866Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7964072Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7964224Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7964439Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7964543Z self.run() 2022-11-23T02:58:19.7964731Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7964878Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7965227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7965360Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7965727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7965852Z getattr(self, test_name)() 2022-11-23T02:58:19.7966214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7966311Z fn() 2022-11-23T02:58:19.7966665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7966788Z test(self, **param_kwargs) 2022-11-23T02:58:19.7967147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7967274Z return func(*args, **kwargs) 2022-11-23T02:58:19.7967526Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7967639Z self.run_subtests( 2022-11-23T02:58:19.7968000Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7968162Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7968512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7968669Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7969049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7969168Z output = model(*input) 2022-11-23T02:58:19.7969499Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7969645Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7970028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7970204Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7970609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7970737Z _lazy_init(state, module) 2022-11-23T02:58:19.7971096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7971243Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7971584Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7971710Z return func(*args, **kwargs) 2022-11-23T02:58:19.7972093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7972245Z p_assert( 2022-11-23T02:58:19.7972569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.7972699Z traceback.print_stack() 2022-11-23T02:58:19.7973111Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.7973243Z File "", line 1, in 2022-11-23T02:58:19.7973457Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7973601Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7973805Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7973939Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7974154Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7974261Z self.run() 2022-11-23T02:58:19.7974466Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7974613Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7974960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7975094Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7975460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7975565Z getattr(self, test_name)() 2022-11-23T02:58:19.7975932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7976029Z fn() 2022-11-23T02:58:19.7976400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7976522Z test(self, **param_kwargs) 2022-11-23T02:58:19.7976884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7977011Z return func(*args, **kwargs) 2022-11-23T02:58:19.7977263Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7977360Z self.run_subtests( 2022-11-23T02:58:19.7977718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7977879Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7978248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7978402Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7978780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7978901Z output = model(*input) 2022-11-23T02:58:19.7979234Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7979360Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7979786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7979970Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7980349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.7980470Z _lazy_init(state, module) 2022-11-23T02:58:19.7980826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7980972Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7981315Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7981473Z return func(*args, **kwargs) 2022-11-23T02:58:19.7981860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.7981964Z p_assert( 2022-11-23T02:58:19.7982306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7982432Z traceback.print_stack() 2022-11-23T02:58:19.7982680Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.7982928Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.7983332Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7983716Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.7983851Z File "", line 1, in 2022-11-23T02:58:19.7984066Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7984209Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7984416Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7984568Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7984785Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7984888Z self.run() 2022-11-23T02:58:19.7985076Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7985221Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7985566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7985699Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7986067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7986196Z getattr(self, test_name)() 2022-11-23T02:58:19.7986562Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7986661Z fn() 2022-11-23T02:58:19.7987016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7987139Z test(self, **param_kwargs) 2022-11-23T02:58:19.7987499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7987624Z return func(*args, **kwargs) 2022-11-23T02:58:19.7987872Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7987986Z self.run_subtests( 2022-11-23T02:58:19.7988345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7988511Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7988864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7989353Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7989767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7989889Z output = model(*input) 2022-11-23T02:58:19.7990216Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7990359Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.7990743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.7990922Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.7991345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.7991466Z _lazy_init(state, module) 2022-11-23T02:58:19.7991825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.7991969Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.7992312Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.7992434Z return func(*args, **kwargs) 2022-11-23T02:58:19.7992816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.7992919Z p_assert( 2022-11-23T02:58:19.7993242Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.7993372Z traceback.print_stack() 2022-11-23T02:58:19.7993501Z File "", line 1, in 2022-11-23T02:58:19.7993714Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.7993856Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.7994062Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.7994213Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.7994412Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.7994517Z self.run() 2022-11-23T02:58:19.7994721Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.7994866Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.7995212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.7995343Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.7995712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.7995836Z getattr(self, test_name)() 2022-11-23T02:58:19.7996186Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.7996283Z fn() 2022-11-23T02:58:19.7996650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.7996771Z test(self, **param_kwargs) 2022-11-23T02:58:19.7997130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.7997255Z return func(*args, **kwargs) 2022-11-23T02:58:19.7997504Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.7997617Z self.run_subtests( 2022-11-23T02:58:19.7997956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.7998122Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.7998536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.7998695Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.7999077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.7999197Z output = model(*input) 2022-11-23T02:58:19.7999522Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.7999663Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8000028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8000208Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8000638Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8000759Z _lazy_init(state, module) 2022-11-23T02:58:19.8001124Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8001269Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8001612Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8001735Z return func(*args, **kwargs) 2022-11-23T02:58:19.8002100Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8002202Z p_assert( 2022-11-23T02:58:19.8002542Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8002672Z traceback.print_stack() 2022-11-23T02:58:19.8002919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.8003168Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.8003576Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:58:19.8003706Z File "", line 1, in 2022-11-23T02:58:19.8003904Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8004045Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8004254Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8004405Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8004622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8004730Z self.run() 2022-11-23T02:58:19.8004934Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8005080Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8005412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8005548Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8005913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8006037Z getattr(self, test_name)() 2022-11-23T02:58:19.8006401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8006500Z fn() 2022-11-23T02:58:19.8006868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8006992Z test(self, **param_kwargs) 2022-11-23T02:58:19.8007343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8007472Z return func(*args, **kwargs) 2022-11-23T02:58:19.8007766Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8007887Z self.run_subtests( 2022-11-23T02:58:19.8008248Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8008411Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8008781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8008936Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8009299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8009469Z output = model(*input) 2022-11-23T02:58:19.8009800Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8009940Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8010325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8010502Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8010871Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8010992Z _lazy_init(state, module) 2022-11-23T02:58:19.8011331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8011473Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8011814Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8011942Z return func(*args, **kwargs) 2022-11-23T02:58:19.8012325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8012426Z p_assert( 2022-11-23T02:58:19.8012767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8012894Z traceback.print_stack() 2022-11-23T02:58:19.8013282Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8013413Z File "", line 1, in 2022-11-23T02:58:19.8013624Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8013766Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8013970Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8014119Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8014338Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8014424Z self.run() 2022-11-23T02:58:19.8014628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8014778Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8015126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8015258Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8015624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8015749Z getattr(self, test_name)() 2022-11-23T02:58:19.8016113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8016193Z fn() 2022-11-23T02:58:19.8016562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8016687Z test(self, **param_kwargs) 2022-11-23T02:58:19.8017048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8017220Z return func(*args, **kwargs) 2022-11-23T02:58:19.8017474Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8017591Z self.run_subtests( 2022-11-23T02:58:19.8017951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8018095Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8018461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8018620Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8019069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8019189Z output = model(*input) 2022-11-23T02:58:19.8019519Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8019664Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8020048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8020209Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8020577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8020699Z _lazy_init(state, module) 2022-11-23T02:58:19.8021054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8021199Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8021543Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8021666Z return func(*args, **kwargs) 2022-11-23T02:58:19.8022050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8022135Z p_assert( 2022-11-23T02:58:19.8022475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8022600Z traceback.print_stack() 2022-11-23T02:58:19.8022849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.8023093Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.8023499Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8023636Z File "", line 1, in 2022-11-23T02:58:19.8023850Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8023975Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8024183Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8024334Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8024549Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8024651Z self.run() 2022-11-23T02:58:19.8024857Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8025005Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8025350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8025466Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8025838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8025960Z getattr(self, test_name)() 2022-11-23T02:58:19.8026370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8026480Z fn() 2022-11-23T02:58:19.8026852Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8026976Z test(self, **param_kwargs) 2022-11-23T02:58:19.8027332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8027441Z return func(*args, **kwargs) 2022-11-23T02:58:19.8027693Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8027806Z self.run_subtests( 2022-11-23T02:58:19.8028160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8028374Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8028746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8028901Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8029538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8029643Z output = model(*input) 2022-11-23T02:58:19.8029978Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8030120Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8030501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8030677Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8031053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8031174Z _lazy_init(state, module) 2022-11-23T02:58:19.8031533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8031659Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8032001Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8032125Z return func(*args, **kwargs) 2022-11-23T02:58:19.8032510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8032611Z p_assert( 2022-11-23T02:58:19.8032951Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8033082Z traceback.print_stack() 2022-11-23T02:58:19.8033489Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8033604Z File "", line 1, in 2022-11-23T02:58:19.8033820Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8033963Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8034166Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8034317Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8034533Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8034636Z self.run() 2022-11-23T02:58:19.8034825Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8034971Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8035317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8035454Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8035823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8036019Z getattr(self, test_name)() 2022-11-23T02:58:19.8036400Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8036499Z fn() 2022-11-23T02:58:19.8036848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8036973Z test(self, **param_kwargs) 2022-11-23T02:58:19.8037332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8037458Z return func(*args, **kwargs) 2022-11-23T02:58:19.8037708Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8037890Z self.run_subtests( 2022-11-23T02:58:19.8038252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8038418Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8038768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8038921Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8039300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8039419Z output = model(*input) 2022-11-23T02:58:19.8039748Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8039891Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8040279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8040454Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8040811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8040932Z _lazy_init(state, module) 2022-11-23T02:58:19.8041289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8041430Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8041769Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8041894Z return func(*args, **kwargs) 2022-11-23T02:58:19.8042275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8042379Z p_assert( 2022-11-23T02:58:19.8042702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8042827Z traceback.print_stack() 2022-11-23T02:58:19.8043079Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.8043322Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.8043729Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:58:19.8043859Z File "", line 1, in 2022-11-23T02:58:19.8044072Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8044215Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8044403Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8044554Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8044773Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8044876Z self.run() 2022-11-23T02:58:19.8045080Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8045274Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8045632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8045766Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8046116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8046240Z getattr(self, test_name)() 2022-11-23T02:58:19.8046604Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8046700Z fn() 2022-11-23T02:58:19.8047072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8047245Z test(self, **param_kwargs) 2022-11-23T02:58:19.8047609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8047739Z return func(*args, **kwargs) 2022-11-23T02:58:19.8047972Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8048086Z self.run_subtests( 2022-11-23T02:58:19.8048442Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8048604Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8048968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8049124Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8049510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8049630Z output = model(*input) 2022-11-23T02:58:19.8049947Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8050091Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8050518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8050697Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8051072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8051194Z _lazy_init(state, module) 2022-11-23T02:58:19.8051552Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8051698Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8052023Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8052149Z return func(*args, **kwargs) 2022-11-23T02:58:19.8052536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8052638Z p_assert( 2022-11-23T02:58:19.8052979Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8053105Z traceback.print_stack() 2022-11-23T02:58:19.8053510Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8053640Z File "", line 1, in 2022-11-23T02:58:19.8053836Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8053977Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8054185Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8054336Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8054551Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8054759Z self.run() 2022-11-23T02:58:19.8054975Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8055105Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8055451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8055585Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8055950Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8056074Z getattr(self, test_name)() 2022-11-23T02:58:19.8056434Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8056580Z fn() 2022-11-23T02:58:19.8056953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8057058Z test(self, **param_kwargs) 2022-11-23T02:58:19.8057423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8057548Z return func(*args, **kwargs) 2022-11-23T02:58:19.8057797Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8057911Z self.run_subtests( 2022-11-23T02:58:19.8058269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8058431Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8058800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8058939Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8059318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8059439Z output = model(*input) 2022-11-23T02:58:19.8059768Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8059910Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8060291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8060467Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8060840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8060943Z _lazy_init(state, module) 2022-11-23T02:58:19.8061304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8061447Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8061791Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8061917Z return func(*args, **kwargs) 2022-11-23T02:58:19.8062299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8062402Z p_assert( 2022-11-23T02:58:19.8062743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8062851Z traceback.print_stack() 2022-11-23T02:58:19.8063100Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.8063342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.8063754Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8063884Z File "", line 1, in 2022-11-23T02:58:19.8064143Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8064293Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8064500Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8064634Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8064851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8064957Z self.run() 2022-11-23T02:58:19.8065161Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8065307Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8065655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8065838Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8066210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8066320Z getattr(self, test_name)() 2022-11-23T02:58:19.8066685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8066783Z fn() 2022-11-23T02:58:19.8067151Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8067274Z test(self, **param_kwargs) 2022-11-23T02:58:19.8067637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8067762Z return func(*args, **kwargs) 2022-11-23T02:58:19.8068011Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8068110Z self.run_subtests( 2022-11-23T02:58:19.8068468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8068632Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8069247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8069414Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8069803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8069925Z output = model(*input) 2022-11-23T02:58:19.8070253Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8070378Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8070766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8070945Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8071318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8071439Z _lazy_init(state, module) 2022-11-23T02:58:19.8071797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8071939Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8072282Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8072389Z return func(*args, **kwargs) 2022-11-23T02:58:19.8072773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8072880Z p_assert( 2022-11-23T02:58:19.8073222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8073350Z traceback.print_stack() 2022-11-23T02:58:19.8073825Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8073965Z File "", line 1, in 2022-11-23T02:58:19.8074182Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8074307Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8074513Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8074666Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8074880Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8074984Z self.run() 2022-11-23T02:58:19.8075187Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8075399Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8075728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8075864Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8076233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8076357Z getattr(self, test_name)() 2022-11-23T02:58:19.8076718Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8076815Z fn() 2022-11-23T02:58:19.8077184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8077310Z test(self, **param_kwargs) 2022-11-23T02:58:19.8077653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8077782Z return func(*args, **kwargs) 2022-11-23T02:58:19.8078029Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8078142Z self.run_subtests( 2022-11-23T02:58:19.8078501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8078663Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8079027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8079180Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8079600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8079721Z output = model(*input) 2022-11-23T02:58:19.8080051Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8080196Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8080578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8080759Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8081131Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8081251Z _lazy_init(state, module) 2022-11-23T02:58:19.8081592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8081735Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8082077Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8082201Z return func(*args, **kwargs) 2022-11-23T02:58:19.8082586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8082688Z p_assert( 2022-11-23T02:58:19.8083081Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8083213Z traceback.print_stack() 2022-11-23T02:58:19.8083446Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.8083689Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.8084095Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8084498Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-11-23T02:58:19.8084628Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.8084907Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8085050Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8085260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8085399Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8085617Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8085720Z self.run() 2022-11-23T02:58:19.8085925Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8086073Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8086421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8086556Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8086929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8087039Z getattr(self, test_name)() 2022-11-23T02:58:19.8087406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8087505Z fn() 2022-11-23T02:58:19.8087876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8087999Z test(self, **param_kwargs) 2022-11-23T02:58:19.8088357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8088483Z return func(*args, **kwargs) 2022-11-23T02:58:19.8088733Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8088831Z self.run_subtests( 2022-11-23T02:58:19.8089185Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8089352Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8089722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8089876Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8090260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8090382Z output = model(*input) 2022-11-23T02:58:19.8090711Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8090834Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8091219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8091401Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8091772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8091896Z _lazy_init(state, module) 2022-11-23T02:58:19.8092253Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8092442Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8092794Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8092901Z return func(*args, **kwargs) 2022-11-23T02:58:19.8093287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8093390Z p_assert( 2022-11-23T02:58:19.8093729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8093857Z traceback.print_stack() 2022-11-23T02:58:19.8093987Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.8094251Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8094396Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8094585Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8094742Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8094958Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8095063Z self.run() 2022-11-23T02:58:19.8095269Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8095415Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8095761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8095878Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8096246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8096373Z getattr(self, test_name)() 2022-11-23T02:58:19.8096738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8096836Z fn() 2022-11-23T02:58:19.8097206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8097329Z test(self, **param_kwargs) 2022-11-23T02:58:19.8097688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8097796Z return func(*args, **kwargs) 2022-11-23T02:58:19.8098047Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8098160Z self.run_subtests( 2022-11-23T02:58:19.8098516Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8098683Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8099047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8099202Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8099580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8099682Z output = model(*input) 2022-11-23T02:58:19.8100015Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8100156Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8100539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8100714Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8101093Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8101213Z _lazy_init(state, module) 2022-11-23T02:58:19.8101616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8101746Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8102091Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8102216Z return func(*args, **kwargs) 2022-11-23T02:58:19.8102600Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8102702Z p_assert( 2022-11-23T02:58:19.8103042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8103169Z traceback.print_stack() 2022-11-23T02:58:19.8103421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.8103699Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.8104113Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8104519Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8104771Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.8105171Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8105420Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.8105827Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8106072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.8106311Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.8106710Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.8107092Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:58:19.8107336Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.8107573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.8107969Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8108372Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8109391Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8110164Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8110411Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.8110814Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8111061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.8111535Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 
2022-11-23T02:58:19.8111768Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.8112007Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.8112411Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8112813Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8113057Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.8113518Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8113767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.8114168Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8114410Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.8114787Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8115034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.8115435Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8115682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.8116082Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 
2022-11-23T02:58:19.8116326Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.8116722Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8116962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.8117199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.8117597Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8117979Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8118220Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.8118618Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8118864Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.8119257Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8119497Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.8119730Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.8120126Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8120521Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 
2022-11-23T02:58:19.8120789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:58:19.8121032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:58:19.8121429Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.8121822Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.8122063Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:58:19.8122301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:58:19.8122742Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.8123139Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.8123379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:58:19.8123596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:58:19.8123991Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.8124385Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 
2022-11-23T02:58:19.8124622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:58:19.8124859Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:58:19.8125256Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.8125652Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.8125890Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:58:19.8126124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:58:19.8126517Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.8126891Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.8127140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:58:19.8127541Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.8127779Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:58:19.8128173Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 
2022-11-23T02:58:19.8128417Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:58:19.8128655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:58:19.8129048Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.8129444Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.8129669Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:58:19.8129956Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:58:19.8130363Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.8130757Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.8131001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:58:19.8131241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:58:19.8131685Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.8132078Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.8132323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:58:19.8132702Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 
2022-11-23T02:58:19.8132942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:58:19.8133336Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.8133576Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:58:19.8133810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:58:19.8134211Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.8134606Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.8134843Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:58:19.8135077Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:58:19.8135469Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.8135845Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.8135958Z dist init r=1, world=2 2022-11-23T02:58:19.8136301Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8136629Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8136944Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8137252Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8137558Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8137863Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8138172Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8138524Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8138839Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8139129Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8139435Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8139789Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8139904Z dist init r=0, world=2 2022-11-23T02:58:19.8140241Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8140564Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8140880Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8141194Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8141508Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8141820Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8142128Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8142420Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8142726Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8143041Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8143354Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.8143662Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8143764Z ok (7.117s) 2022-11-23T02:58:19.8144116Z test_mixture_of_experts_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88059 2022-11-23T02:58:19.8144338Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88060 2022-11-23T02:58:19.8144726Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8144911Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8145283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8145529Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8145915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8146091Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8146479Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8146672Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8146925Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.8147177Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.8147645Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.8148033Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.8148271Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.8148503Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.8149780Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8149904Z warnings.warn( 2022-11-23T02:58:19.8150152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.8151208Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8151322Z warnings.warn( 2022-11-23T02:58:19.8151565Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.8151971Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:58:19.8152107Z File "", line 1, in 2022-11-23T02:58:19.8152308Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8152452Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8152664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8152816Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8153033Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8153137Z self.run() 2022-11-23T02:58:19.8153343Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8153472Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8153824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8153964Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8154336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8154460Z getattr(self, test_name)() 2022-11-23T02:58:19.8154900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8155008Z fn() 2022-11-23T02:58:19.8155394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8155500Z test(self, **param_kwargs) 2022-11-23T02:58:19.8155864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8155992Z return func(*args, **kwargs) 2022-11-23T02:58:19.8156242Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8156357Z self.run_subtests( 2022-11-23T02:58:19.8156781Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8156947Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8157319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8157454Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8157837Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8157956Z output = model(*input) 2022-11-23T02:58:19.8158285Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8158425Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8158808Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8158989Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8159360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8159481Z _lazy_init(state, module) 2022-11-23T02:58:19.8159824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8159966Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8160307Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8160432Z return func(*args, **kwargs) 2022-11-23T02:58:19.8160819Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8160921Z p_assert( 2022-11-23T02:58:19.8161259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8161373Z traceback.print_stack() 2022-11-23T02:58:19.8161780Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.8161910Z File "", line 1, in 2022-11-23T02:58:19.8162124Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8162266Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8162472Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8162623Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8162840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8162926Z self.run() 2022-11-23T02:58:19.8163131Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8163277Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8163629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8163761Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8164176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8164306Z getattr(self, test_name)() 2022-11-23T02:58:19.8164675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8164755Z fn() 2022-11-23T02:58:19.8165125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8165247Z test(self, **param_kwargs) 2022-11-23T02:58:19.8165606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8165731Z return func(*args, **kwargs) 2022-11-23T02:58:19.8166046Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8166162Z self.run_subtests( 2022-11-23T02:58:19.8166520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8166669Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8167035Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8167189Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8167568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8167688Z output = model(*input) 2022-11-23T02:58:19.8168017Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8168161Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8168550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8168710Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8169086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8169209Z _lazy_init(state, module) 2022-11-23T02:58:19.8169564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8169707Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8170047Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8170174Z return func(*args, **kwargs) 2022-11-23T02:58:19.8170561Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8170649Z p_assert( 2022-11-23T02:58:19.8170991Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8171117Z traceback.print_stack() 2022-11-23T02:58:19.8171367Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.8171615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.8172018Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8172147Z File "", line 1, in 2022-11-23T02:58:19.8172360Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8172486Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8172692Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8172847Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8173063Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8173165Z self.run() 2022-11-23T02:58:19.8173419Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8173576Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8173925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8174041Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8174409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8174533Z getattr(self, test_name)() 2022-11-23T02:58:19.8174896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8174993Z fn() 2022-11-23T02:58:19.8175417Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8175541Z test(self, **param_kwargs) 2022-11-23T02:58:19.8175888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8176013Z return func(*args, **kwargs) 2022-11-23T02:58:19.8176260Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8176373Z self.run_subtests( 2022-11-23T02:58:19.8176728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8176890Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8177257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8177414Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8177779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8177900Z output = model(*input) 2022-11-23T02:58:19.8178233Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8178375Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8178758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8178934Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8179305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8179426Z _lazy_init(state, module) 2022-11-23T02:58:19.8179780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8179911Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8180255Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8180379Z return func(*args, **kwargs) 2022-11-23T02:58:19.8180766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8180868Z p_assert( 2022-11-23T02:58:19.8181205Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8181333Z traceback.print_stack() 2022-11-23T02:58:19.8181721Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8181851Z File "", line 1, in 2022-11-23T02:58:19.8182063Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8182209Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8182415Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8182566Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8182834Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8182944Z self.run() 2022-11-23T02:58:19.8183131Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8183280Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8183629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8183763Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8184130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8184253Z getattr(self, test_name)() 2022-11-23T02:58:19.8184678Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8184777Z fn() 2022-11-23T02:58:19.8185130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8185256Z test(self, **param_kwargs) 2022-11-23T02:58:19.8185615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8185739Z return func(*args, **kwargs) 2022-11-23T02:58:19.8185991Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8186104Z self.run_subtests( 2022-11-23T02:58:19.8186458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8186620Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8186974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8187129Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8187508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8187627Z output = model(*input) 2022-11-23T02:58:19.8187957Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8188097Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8188479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8188657Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8189249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8189387Z _lazy_init(state, module) 2022-11-23T02:58:19.8189750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8189895Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8190239Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8190363Z return func(*args, **kwargs) 2022-11-23T02:58:19.8190746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8190848Z p_assert( 2022-11-23T02:58:19.8191170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8191295Z traceback.print_stack() 2022-11-23T02:58:19.8191543Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.8191789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.8192201Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:58:19.8192332Z File "", line 1, in 2022-11-23T02:58:19.8192619Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8192773Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8192962Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8193116Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8193335Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8193439Z self.run() 2022-11-23T02:58:19.8193641Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8193788Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8194258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8194393Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8194747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8194872Z getattr(self, test_name)() 2022-11-23T02:58:19.8195238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8195336Z fn() 2022-11-23T02:58:19.8195704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8195827Z test(self, **param_kwargs) 2022-11-23T02:58:19.8196186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8196295Z return func(*args, **kwargs) 2022-11-23T02:58:19.8196551Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8196666Z self.run_subtests( 2022-11-23T02:58:19.8197025Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8197190Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8197558Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8197710Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8198087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8198189Z output = model(*input) 2022-11-23T02:58:19.8198521Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8198662Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8199052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8199227Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8199600Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8199721Z _lazy_init(state, module) 2022-11-23T02:58:19.8200075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8200218Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8200542Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8200667Z return func(*args, **kwargs) 2022-11-23T02:58:19.8201047Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8201152Z p_assert( 2022-11-23T02:58:19.8201492Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8201619Z traceback.print_stack() 2022-11-23T02:58:19.8202072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8202190Z File "", line 1, in 2022-11-23T02:58:19.8202407Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8202551Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8202759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8202910Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8203128Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8203231Z self.run() 2022-11-23T02:58:19.8203486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8203616Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8203964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8204100Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8204469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8204593Z getattr(self, test_name)() 2022-11-23T02:58:19.8204957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8205055Z fn() 2022-11-23T02:58:19.8205423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8205531Z test(self, **param_kwargs) 2022-11-23T02:58:19.8205893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8206019Z return func(*args, **kwargs) 2022-11-23T02:58:19.8206271Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8206387Z self.run_subtests( 2022-11-23T02:58:19.8206745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8206909Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8207275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8207412Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8207791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8207911Z output = model(*input) 2022-11-23T02:58:19.8208245Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8208384Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8208770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8208949Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8209320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8209423Z _lazy_init(state, module) 2022-11-23T02:58:19.8209780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8209922Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8210265Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8210394Z return func(*args, **kwargs) 2022-11-23T02:58:19.8210777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8210879Z p_assert( 2022-11-23T02:58:19.8211269Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8211384Z traceback.print_stack() 2022-11-23T02:58:19.8211632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.8211883Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.8212290Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8212419Z File "", line 1, in 2022-11-23T02:58:19.8212633Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8212842Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8213050Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8213185Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8213406Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8213511Z self.run() 2022-11-23T02:58:19.8213716Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8213861Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8214208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8214342Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8214708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8214815Z getattr(self, test_name)() 2022-11-23T02:58:19.8215184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8215281Z fn() 2022-11-23T02:58:19.8215649Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8215775Z test(self, **param_kwargs) 2022-11-23T02:58:19.8216134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8216260Z return func(*args, **kwargs) 2022-11-23T02:58:19.8216494Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8216609Z self.run_subtests( 2022-11-23T02:58:19.8216967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8217135Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8217504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8217656Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8218039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8218159Z output = model(*input) 2022-11-23T02:58:19.8218472Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8218615Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8218997Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8219175Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8219549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8219673Z _lazy_init(state, module) 2022-11-23T02:58:19.8220031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8220173Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8220564Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8220677Z return func(*args, **kwargs) 2022-11-23T02:58:19.8221066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8221169Z p_assert( 2022-11-23T02:58:19.8221509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8221638Z traceback.print_stack() 2022-11-23T02:58:19.8222041Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8222222Z File "", line 1, in 2022-11-23T02:58:19.8222418Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8222566Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8222776Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8222929Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8223147Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8223251Z self.run() 2022-11-23T02:58:19.8223454Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8223601Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8223934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8224069Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8224433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8224559Z getattr(self, test_name)() 2022-11-23T02:58:19.8224925Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8225025Z fn() 2022-11-23T02:58:19.8225393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8225516Z test(self, **param_kwargs) 2022-11-23T02:58:19.8225862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8225989Z return func(*args, **kwargs) 2022-11-23T02:58:19.8226239Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8226351Z self.run_subtests( 2022-11-23T02:58:19.8226710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8226877Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8227243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8227400Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8227764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8227882Z output = model(*input) 2022-11-23T02:58:19.8228211Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8228352Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8228732Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8228909Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8229538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8229662Z _lazy_init(state, module) 2022-11-23T02:58:19.8230072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8230226Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8230575Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8230701Z return func(*args, **kwargs) 2022-11-23T02:58:19.8231089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8231192Z p_assert( 2022-11-23T02:58:19.8231533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8231660Z traceback.print_stack() 2022-11-23T02:58:19.8231958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.8232207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.8232622Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-11-23T02:58:19.8232753Z File "", line 1, in 2022-11-23T02:58:19.8232969Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8233114Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8233321Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8233472Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8233671Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8233777Z self.run() 2022-11-23T02:58:19.8233989Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8234135Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8234481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8234618Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8234989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8235113Z getattr(self, test_name)() 2022-11-23T02:58:19.8235459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8235557Z fn() 2022-11-23T02:58:19.8235927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8236051Z test(self, **param_kwargs) 2022-11-23T02:58:19.8236411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8236541Z return func(*args, **kwargs) 2022-11-23T02:58:19.8236788Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8236887Z self.run_subtests( 2022-11-23T02:58:19.8237248Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8237412Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8237780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8237934Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8238317Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8238436Z output = model(*input) 2022-11-23T02:58:19.8238770Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8238895Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8239328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8239513Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8239890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8240012Z _lazy_init(state, module) 2022-11-23T02:58:19.8240368Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8240512Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8240852Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8241027Z return func(*args, **kwargs) 2022-11-23T02:58:19.8241396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8241500Z p_assert( 2022-11-23T02:58:19.8241841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8241970Z traceback.print_stack() 2022-11-23T02:58:19.8242379Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8242509Z File "", line 1, in 2022-11-23T02:58:19.8242724Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8242849Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8243055Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8243206Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8243425Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8243529Z self.run() 2022-11-23T02:58:19.8243733Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8243884Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8244232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8244348Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8244722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8244845Z getattr(self, test_name)() 2022-11-23T02:58:19.8245212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8245310Z fn() 2022-11-23T02:58:19.8245852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8246253Z test(self, **param_kwargs) 2022-11-23T02:58:19.8246783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8247191Z return func(*args, **kwargs) 2022-11-23T02:58:19.8247588Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8247976Z self.run_subtests( 2022-11-23T02:58:19.8248485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8248922Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8249460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8249897Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8250517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8250932Z output = model(*input) 2022-11-23T02:58:19.8251399Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8251852Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8252422Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8252882Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8253461Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8253859Z _lazy_init(state, module) 2022-11-23T02:58:19.8254382Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8254779Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8255360Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8255752Z return func(*args, **kwargs) 2022-11-23T02:58:19.8256282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8256672Z p_assert( 2022-11-23T02:58:19.8257350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8257748Z traceback.print_stack() 2022-11-23T02:58:19.8258146Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.8258670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.8259349Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8259798Z File "", line 1, in 2022-11-23T02:58:19.8260164Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8260551Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8260939Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8261334Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8261722Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8262061Z self.run() 2022-11-23T02:58:19.8262398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8262757Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8263281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8263678Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8264198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8264601Z getattr(self, test_name)() 2022-11-23T02:58:19.8265128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8265503Z fn() 2022-11-23T02:58:19.8265989Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8266388Z test(self, **param_kwargs) 2022-11-23T02:58:19.8266912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8267295Z return func(*args, **kwargs) 2022-11-23T02:58:19.8267698Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8268074Z self.run_subtests( 2022-11-23T02:58:19.8268582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8269254Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8269831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8270341Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8270906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8271328Z output = model(*input) 2022-11-23T02:58:19.8271820Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8272212Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8272757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8273234Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8273891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8274293Z _lazy_init(state, module) 2022-11-23T02:58:19.8274792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8275207Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8275727Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8276112Z return func(*args, **kwargs) 2022-11-23T02:58:19.8276637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8277027Z p_assert( 2022-11-23T02:58:19.8277499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8277868Z traceback.print_stack() 2022-11-23T02:58:19.8278438Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8278872Z File "", line 1, in 2022-11-23T02:58:19.8279236Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8279611Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8279991Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8280363Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8280742Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8281082Z self.run() 2022-11-23T02:58:19.8281418Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8281775Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8282319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8284878Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8285458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8285868Z getattr(self, test_name)() 2022-11-23T02:58:19.8286383Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8286758Z fn() 2022-11-23T02:58:19.8287257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8287642Z test(self, **param_kwargs) 2022-11-23T02:58:19.8288162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8288554Z return func(*args, **kwargs) 2022-11-23T02:58:19.8288947Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8289332Z self.run_subtests( 2022-11-23T02:58:19.8289841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8290298Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8290951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8291406Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8292004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8292406Z output = model(*input) 2022-11-23T02:58:19.8292875Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8293269Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8293851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8294406Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8294980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8295389Z _lazy_init(state, module) 2022-11-23T02:58:19.8295907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8296302Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8296830Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8297217Z return func(*args, **kwargs) 2022-11-23T02:58:19.8297764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8298140Z p_assert( 2022-11-23T02:58:19.8298621Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8299011Z traceback.print_stack() 2022-11-23T02:58:19.8299403Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.8299915Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.8300591Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 
2022-11-23T02:58:19.8301023Z File "", line 1, in 2022-11-23T02:58:19.8301385Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8301761Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8302139Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8302499Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8302895Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8303233Z self.run() 2022-11-23T02:58:19.8303551Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8303926Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8304450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8304842Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8305356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8305758Z getattr(self, test_name)() 2022-11-23T02:58:19.8306279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8306634Z fn() 2022-11-23T02:58:19.8307127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8307528Z test(self, **param_kwargs) 2022-11-23T02:58:19.8308044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8308422Z return func(*args, **kwargs) 2022-11-23T02:58:19.8308885Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8309612Z self.run_subtests( 2022-11-23T02:58:19.8310120Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8310552Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8311110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8311542Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8312088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8312595Z output = model(*input) 2022-11-23T02:58:19.8313082Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8313460Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8314016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8314481Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8315056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8315438Z _lazy_init(state, module) 2022-11-23T02:58:19.8315950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8316365Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8316874Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8317256Z return func(*args, **kwargs) 2022-11-23T02:58:19.8317802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8318173Z p_assert( 2022-11-23T02:58:19.8318648Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8319036Z traceback.print_stack() 2022-11-23T02:58:19.8319602Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8320018Z File "", line 1, in 2022-11-23T02:58:19.8320395Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8320771Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8321137Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8321514Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8321908Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8322244Z self.run() 2022-11-23T02:58:19.8322570Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8322939Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8323463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8323844Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8324375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8324772Z getattr(self, test_name)() 2022-11-23T02:58:19.8325274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8325654Z fn() 2022-11-23T02:58:19.8326148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8326548Z test(self, **param_kwargs) 2022-11-23T02:58:19.8327117Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8327531Z return func(*args, **kwargs) 2022-11-23T02:58:19.8327934Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8328295Z self.run_subtests( 2022-11-23T02:58:19.8328799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8329231Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8329791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8330265Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8330830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8331233Z output = model(*input) 2022-11-23T02:58:19.8331700Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8332093Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8332650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8333113Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8333667Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8334069Z _lazy_init(state, module) 2022-11-23T02:58:19.8334587Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8334983Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8335505Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8335898Z return func(*args, **kwargs) 2022-11-23T02:58:19.8336421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8336810Z p_assert( 2022-11-23T02:58:19.8337279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8337664Z traceback.print_stack() 2022-11-23T02:58:19.8338054Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.8338562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.8339239Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8339653Z File "", line 1, in 2022-11-23T02:58:19.8340038Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8340414Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8340789Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8341146Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8341541Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8341883Z self.run() 2022-11-23T02:58:19.8342205Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8342573Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8343098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8343482Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8344015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8344412Z getattr(self, test_name)() 2022-11-23T02:58:19.8344984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8345348Z fn() 2022-11-23T02:58:19.8345845Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8346251Z test(self, **param_kwargs) 2022-11-23T02:58:19.8346750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8347146Z return func(*args, **kwargs) 2022-11-23T02:58:19.8347555Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8347996Z self.run_subtests( 2022-11-23T02:58:19.8348487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8348920Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8349725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8350178Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8350742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8351145Z output = model(*input) 2022-11-23T02:58:19.8351631Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8352010Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8352560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8353033Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8353598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8354003Z _lazy_init(state, module) 2022-11-23T02:58:19.8354514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8354927Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8355432Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8355823Z return func(*args, **kwargs) 2022-11-23T02:58:19.8356365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8356735Z p_assert( 2022-11-23T02:58:19.8357249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8357634Z traceback.print_stack() 2022-11-23T02:58:19.8358181Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8358620Z File "", line 1, in 2022-11-23T02:58:19.8359260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8359780Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8360145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8360523Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8360917Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8361240Z self.run() 2022-11-23T02:58:19.8361579Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8361959Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8362493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8362868Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8363499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8363909Z getattr(self, test_name)() 2022-11-23T02:58:19.8364417Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8364786Z fn() 2022-11-23T02:58:19.8365279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8365663Z test(self, **param_kwargs) 2022-11-23T02:58:19.8366184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8366651Z return func(*args, **kwargs) 2022-11-23T02:58:19.8367056Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8367412Z self.run_subtests( 2022-11-23T02:58:19.8367921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8368350Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8368888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8369313Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8369871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8370273Z output = model(*input) 2022-11-23T02:58:19.8370740Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8371139Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8371697Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8372147Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8372722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8373119Z _lazy_init(state, module) 2022-11-23T02:58:19.8373631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8374026Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8374545Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8374929Z return func(*args, **kwargs) 2022-11-23T02:58:19.8375451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8375841Z p_assert( 2022-11-23T02:58:19.8376315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8376704Z traceback.print_stack() 2022-11-23T02:58:19.8377097Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.8377606Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.8378276Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:58:19.8378692Z File "", line 1, in 2022-11-23T02:58:19.8379068Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8379446Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8379810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8380191Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8380583Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8380919Z self.run() 2022-11-23T02:58:19.8381293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8381674Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8382196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8382571Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8383101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8383499Z getattr(self, test_name)() 2022-11-23T02:58:19.8384011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8384425Z fn() 2022-11-23T02:58:19.8384922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8385323Z test(self, **param_kwargs) 2022-11-23T02:58:19.8385827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8386226Z return func(*args, **kwargs) 2022-11-23T02:58:19.8386627Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8386985Z self.run_subtests( 2022-11-23T02:58:19.8387488Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8387919Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8388472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8388886Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8389935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8390394Z output = model(*input) 2022-11-23T02:58:19.8390880Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8391271Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8391825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8392292Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8392847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8393244Z _lazy_init(state, module) 2022-11-23T02:58:19.8393757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8394149Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8394669Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8395058Z return func(*args, **kwargs) 2022-11-23T02:58:19.8395599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8395967Z p_assert( 2022-11-23T02:58:19.8396441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8396826Z traceback.print_stack() 2022-11-23T02:58:19.8397374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8397800Z File "", line 1, in 2022-11-23T02:58:19.8398182Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8398557Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8398916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8399393Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8399800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8400123Z self.run() 2022-11-23T02:58:19.8400457Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8400825Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8401333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8401728Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8402259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8402727Z getattr(self, test_name)() 2022-11-23T02:58:19.8403233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8403600Z fn() 2022-11-23T02:58:19.8404099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8404480Z test(self, **param_kwargs) 2022-11-23T02:58:19.8404996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8405388Z return func(*args, **kwargs) 2022-11-23T02:58:19.8405772Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8406149Z self.run_subtests( 2022-11-23T02:58:19.8406650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8407082Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8407620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8408044Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8408611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8408993Z output = model(*input) 2022-11-23T02:58:19.8409473Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8409859Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8410413Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8410860Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8411430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8411833Z _lazy_init(state, module) 2022-11-23T02:58:19.8412325Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8412742Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8413260Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8413644Z return func(*args, **kwargs) 2022-11-23T02:58:19.8414168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8414554Z p_assert( 2022-11-23T02:58:19.8415032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8415400Z traceback.print_stack() 2022-11-23T02:58:19.8415808Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.8416314Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.8416981Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8417491Z File "", line 1, in 2022-11-23T02:58:19.8417876Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8418256Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8418616Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8418993Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8419385Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8419721Z self.run() 2022-11-23T02:58:19.8420042Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8420481Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8421005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8421382Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8421917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8422316Z getattr(self, test_name)() 2022-11-23T02:58:19.8422820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8423190Z fn() 2022-11-23T02:58:19.8423684Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8424083Z test(self, **param_kwargs) 2022-11-23T02:58:19.8424580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8424983Z return func(*args, **kwargs) 2022-11-23T02:58:19.8425385Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8425741Z self.run_subtests( 2022-11-23T02:58:19.8426246Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8426676Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8427232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8427641Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8428195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8428598Z output = model(*input) 2022-11-23T02:58:19.8429328Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8429740Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8430301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8430766Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8431322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8431724Z _lazy_init(state, module) 2022-11-23T02:58:19.8432235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8432633Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8433148Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8433529Z return func(*args, **kwargs) 2022-11-23T02:58:19.8434053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8434445Z p_assert( 2022-11-23T02:58:19.8434917Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8435380Z traceback.print_stack() 2022-11-23T02:58:19.8435945Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8436376Z File "", line 1, in 2022-11-23T02:58:19.8436751Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8437114Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8437488Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8437864Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8438259Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8438647Z self.run() 2022-11-23T02:58:19.8438982Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8439352Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8439863Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8440256Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8440784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8441162Z getattr(self, test_name)() 2022-11-23T02:58:19.8441682Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8442054Z fn() 2022-11-23T02:58:19.8442545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8442932Z test(self, **param_kwargs) 2022-11-23T02:58:19.8443451Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8443846Z return func(*args, **kwargs) 2022-11-23T02:58:19.8444238Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8444612Z self.run_subtests( 2022-11-23T02:58:19.8445116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8445544Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8446083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8446507Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8447068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8447455Z output = model(*input) 2022-11-23T02:58:19.8447938Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8448328Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8448885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8449334Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8449948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8450353Z _lazy_init(state, module) 2022-11-23T02:58:19.8450849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8451259Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8451779Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8452172Z return func(*args, **kwargs) 2022-11-23T02:58:19.8452696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8453083Z p_assert( 2022-11-23T02:58:19.8453625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8454002Z traceback.print_stack() 2022-11-23T02:58:19.8454407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.8454918Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.8455568Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:58:19.8456002Z File "", line 1, in 2022-11-23T02:58:19.8456435Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8456811Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8457168Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8457545Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8457940Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8458262Z self.run() 2022-11-23T02:58:19.8458596Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8458965Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8459491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8459869Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8460401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8460805Z getattr(self, test_name)() 2022-11-23T02:58:19.8461309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8461680Z fn() 2022-11-23T02:58:19.8462177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8462563Z test(self, **param_kwargs) 2022-11-23T02:58:19.8463082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8463475Z return func(*args, **kwargs) 2022-11-23T02:58:19.8463878Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8464237Z self.run_subtests( 2022-11-23T02:58:19.8464737Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8465171Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8465709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8466134Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8466696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8467099Z output = model(*input) 2022-11-23T02:58:19.8467568Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8467958Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8468514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8469185Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8469774Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8470178Z _lazy_init(state, module) 2022-11-23T02:58:19.8470692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8471157Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8471689Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8472075Z return func(*args, **kwargs) 2022-11-23T02:58:19.8472601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8472986Z p_assert( 2022-11-23T02:58:19.8473458Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8473843Z traceback.print_stack() 2022-11-23T02:58:19.8474391Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8474589Z File "", line 1, in 2022-11-23T02:58:19.8474808Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8474953Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8475161Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8475316Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8475516Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8475622Z self.run() 2022-11-23T02:58:19.8475826Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8475972Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8476322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8476457Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8476828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8476950Z getattr(self, test_name)() 2022-11-23T02:58:19.8477305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8477404Z fn() 2022-11-23T02:58:19.8477774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8477896Z test(self, **param_kwargs) 2022-11-23T02:58:19.8478257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8478383Z return func(*args, **kwargs) 2022-11-23T02:58:19.8478634Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8478747Z self.run_subtests( 2022-11-23T02:58:19.8479094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8479257Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8479624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8479776Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8480155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8480274Z output = model(*input) 2022-11-23T02:58:19.8480688Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8480940Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8481439Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8481630Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8482006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8482128Z _lazy_init(state, module) 2022-11-23T02:58:19.8482550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8482700Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8483046Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8483170Z return func(*args, **kwargs) 2022-11-23T02:58:19.8483535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8483640Z p_assert( 2022-11-23T02:58:19.8483980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8484159Z traceback.print_stack() 2022-11-23T02:58:19.8484415Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.8484661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.8485071Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8485203Z File "", line 1, in 2022-11-23T02:58:19.8485400Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8485543Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8485751Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8485901Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8486117Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8486227Z self.run() 2022-11-23T02:58:19.8486432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8486579Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8486911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8487045Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8487417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8487541Z getattr(self, test_name)() 2022-11-23T02:58:19.8487908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8488005Z fn() 2022-11-23T02:58:19.8488378Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8488487Z test(self, **param_kwargs) 2022-11-23T02:58:19.8488847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8488976Z return func(*args, **kwargs) 2022-11-23T02:58:19.8489230Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8489351Z self.run_subtests( 2022-11-23T02:58:19.8489708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8489870Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8490240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8490390Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8490752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8490877Z output = model(*input) 2022-11-23T02:58:19.8491207Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8491352Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8491783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8491968Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8492347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8492469Z _lazy_init(state, module) 2022-11-23T02:58:19.8492805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8492950Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8493293Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8493466Z return func(*args, **kwargs) 2022-11-23T02:58:19.8493856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8493958Z p_assert( 2022-11-23T02:58:19.8494302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8494430Z traceback.print_stack() 2022-11-23T02:58:19.8494821Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8494954Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.8495171Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8495314Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8495520Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8495678Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8495896Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8495981Z self.run() 2022-11-23T02:58:19.8496189Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8496338Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8496683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8496817Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8497185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8497308Z getattr(self, test_name)() 2022-11-23T02:58:19.8497675Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8497755Z fn() 2022-11-23T02:58:19.8498129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8498255Z test(self, **param_kwargs) 2022-11-23T02:58:19.8498619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8498745Z return func(*args, **kwargs) 2022-11-23T02:58:19.8498997Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8499112Z self.run_subtests( 2022-11-23T02:58:19.8499473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8499618Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8499983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8500138Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8500521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8500642Z output = model(*input) 2022-11-23T02:58:19.8501022Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8501170Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8501559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8501737Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8502091Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8502215Z _lazy_init(state, module) 2022-11-23T02:58:19.8502574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8502769Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8503117Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8503242Z return func(*args, **kwargs) 2022-11-23T02:58:19.8503628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8503731Z p_assert( 2022-11-23T02:58:19.8504059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8504189Z traceback.print_stack() 2022-11-23T02:58:19.8504440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.8504688Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.8505094Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8505506Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8505756Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.8505998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.8506384Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8506786Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-11-23T02:58:19.8507032Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.8507270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.8507672Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.8508075Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.8508319Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.8508555Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.8509164Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8509579Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8510327Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8510659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.8510911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.8511315Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 
2022-11-23T02:58:19.8511714Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8512477Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8513313Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8513563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.8513805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.8514209Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8514606Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8514857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.8515081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.8515492Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8515894Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 
2022-11-23T02:58:19.8516134Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.8516374Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.8516775Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8517172Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8517425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.8517666Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.8518064Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8518443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8518684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.8518922Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.8519321Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8519720Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:58:19.8520012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.8520257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.8520661Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8521054Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8521818Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8522752Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8522998Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.8523222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.8523626Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8524023Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 
2022-11-23T02:58:19.8524272Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:58:19.8524513Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:58:19.8524917Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.8525317Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.8525562Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:58:19.8525805Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:58:19.8526187Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.8526581Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.8526831Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:58:19.8527070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:58:19.8527471Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.8527868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 
2022-11-23T02:58:19.8528112Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:58:19.8528355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:58:19.8528752Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.8529155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.8529425Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:58:19.8529671Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:58:19.8530073Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.8530469Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.8531236Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8532049Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.8532296Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:58:19.8532532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:58:19.8532931Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.8533333Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.8533581Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:58:19.8533803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:58:19.8534211Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.8534971Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8535371Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.8536113Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.8536366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:58:19.8536608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:58:19.8537011Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.8537410Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.8537661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:58:19.8537902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:58:19.8538288Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.8538691Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.8538986Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:58:19.8539234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:58:19.8539632Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.8540027Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 
2022-11-23T02:58:19.8540272Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:58:19.8540558Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:58:19.8540962Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.8541361Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.8542101Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8542857Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8543108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:58:19.8543355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:58:19.8543761Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.8544161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.8544274Z dist init r=1, world=2 2022-11-23T02:58:19.8544616Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8544942Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8545260Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8545573Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8545884Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8546174Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8546482Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8546796Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8547148Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8547458Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8547766Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.8548074Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8548186Z dist init r=0, world=2 2022-11-23T02:58:19.8548568Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8548893Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8549435Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8549753Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8550077Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8550387Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8550703Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8551017Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8551327Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.8551638Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8551947Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8552262Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8552365Z ok (7.617s) 2022-11-23T02:58:19.8552734Z test_mixture_of_experts_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88406 2022-11-23T02:58:19.8552961Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88407 2022-11-23T02:58:19.8553335Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8553523Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8553915Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8554115Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8554490Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8554668Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8555121Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8555322Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8555575Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.8555806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.8556217Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.8556622Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.8556915Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.8557154Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.8558203Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8558318Z warnings.warn( 2022-11-23T02:58:19.8558566Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.8559596Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.8559709Z warnings.warn( 2022-11-23T02:58:19.8559953Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.8560359Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.8560741Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.8560873Z File "", line 1, in 2022-11-23T02:58:19.8561092Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8561236Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8561445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8561601Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8561823Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8561928Z self.run() 2022-11-23T02:58:19.8562116Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8562267Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8562617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8562749Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8563122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8563251Z getattr(self, test_name)() 2022-11-23T02:58:19.8563621Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8563723Z fn() 2022-11-23T02:58:19.8564124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8564255Z test(self, **param_kwargs) 
2022-11-23T02:58:19.8564623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8564750Z return func(*args, **kwargs) 2022-11-23T02:58:19.8565005Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8565119Z self.run_subtests( 2022-11-23T02:58:19.8565478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8565670Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8566039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8566197Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8566584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8566710Z output = model(*input) 2022-11-23T02:58:19.8567044Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8567188Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8567573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8567752Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8568108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8568238Z _lazy_init(state, module) 2022-11-23T02:58:19.8568600Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8568748Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8569098Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 
27, in decorate_context 2022-11-23T02:58:19.8569227Z return func(*args, **kwargs) 2022-11-23T02:58:19.8569360Z File "", line 1, in 2022-11-23T02:58:19.8569753Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8569839Z p_assert( 2022-11-23T02:58:19.8570187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8570315Z traceback.print_stack() 2022-11-23T02:58:19.8570536Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8570681Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8570885Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8571038Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8571237Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8571342Z self.run() 2022-11-23T02:58:19.8571547Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8571696Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8572039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8572179Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8572549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8572675Z getattr(self, test_name)() 2022-11-23T02:58:19.8573020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8573121Z fn() 2022-11-23T02:58:19.8573537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8573670Z test(self, **param_kwargs) 
2022-11-23T02:58:19.8574034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8574161Z return func(*args, **kwargs) 2022-11-23T02:58:19.8574420Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8574535Z self.run_subtests( 2022-11-23T02:58:19.8574876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8575104Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8575472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8575628Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8576010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8576131Z output = model(*input) 2022-11-23T02:58:19.8576464Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8576606Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8576973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8577155Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8577536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8577660Z _lazy_init(state, module) 2022-11-23T02:58:19.8578018Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8578170Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8578514Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 
27, in decorate_context 2022-11-23T02:58:19.8578637Z return func(*args, **kwargs) 2022-11-23T02:58:19.8579022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8579107Z p_assert( 2022-11-23T02:58:19.8579451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8579580Z traceback.print_stack() 2022-11-23T02:58:19.8579828Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.8580081Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.8580494Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8580629Z File "", line 1, in 2022-11-23T02:58:19.8580844Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8580971Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8581178Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8581332Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8581548Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8581652Z self.run() 2022-11-23T02:58:19.8581860Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8582010Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8582358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8582474Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8582890Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8583022Z getattr(self, test_name)() 
2022-11-23T02:58:19.8583392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8583492Z fn() 2022-11-23T02:58:19.8583917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8584045Z test(self, **param_kwargs) 2022-11-23T02:58:19.8584409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8584620Z return func(*args, **kwargs) 2022-11-23T02:58:19.8584876Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8584994Z self.run_subtests( 2022-11-23T02:58:19.8585362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8585531Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8585900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8586053Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8586437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8586540Z output = model(*input) 2022-11-23T02:58:19.8586875Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8587020Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8587412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8587596Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8587972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:19.8588097Z _lazy_init(state, module) 2022-11-23T02:58:19.8588457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8588584Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8589137Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8589278Z return func(*args, **kwargs) 2022-11-23T02:58:19.8589679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8589784Z p_assert( 2022-11-23T02:58:19.8590126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8590254Z traceback.print_stack() 2022-11-23T02:58:19.8590660Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:58:19.8590772Z File "", line 1, in 2022-11-23T02:58:19.8590992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8591138Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8591344Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8591498Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8591717Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8591827Z self.run() 2022-11-23T02:58:19.8592037Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8592167Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8592584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8592730Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8593098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8593227Z getattr(self, test_name)() 2022-11-23T02:58:19.8593596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8593694Z fn() 2022-11-23T02:58:19.8594070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8594238Z test(self, **param_kwargs) 2022-11-23T02:58:19.8594599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8594728Z return func(*args, **kwargs) 2022-11-23T02:58:19.8594982Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8595101Z self.run_subtests( 2022-11-23T02:58:19.8595457Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8595618Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8595985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8596121Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8596501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8596627Z output = model(*input) 2022-11-23T02:58:19.8596954Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8597097Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8619222Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8619472Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8619905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8620036Z _lazy_init(state, module) 2022-11-23T02:58:19.8620387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8620532Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8620893Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8621022Z return func(*args, **kwargs) 2022-11-23T02:58:19.8621410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8621517Z p_assert( 2022-11-23T02:58:19.8621868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8621995Z traceback.print_stack() 2022-11-23T02:58:19.8622227Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.8622476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.8622890Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8623039Z File "", line 1, in 2022-11-23T02:58:19.8623265Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8623410Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8623617Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8623880Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8624093Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8624198Z self.run() 2022-11-23T02:58:19.8624404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8624558Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8624907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8625046Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8625416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8625598Z getattr(self, test_name)() 2022-11-23T02:58:19.8625949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8626047Z fn() 2022-11-23T02:58:19.8626428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8626553Z test(self, **param_kwargs) 
2022-11-23T02:58:19.8626922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8627047Z return func(*args, **kwargs) 2022-11-23T02:58:19.8627299Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8627396Z self.run_subtests( 2022-11-23T02:58:19.8627761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8627932Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8628299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8628458Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8628844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8629182Z output = model(*input) 2022-11-23T02:58:19.8629535Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8629684Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8630052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8630231Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8630611Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8630740Z _lazy_init(state, module) 2022-11-23T02:58:19.8631097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8631244Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8631596Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 
27, in decorate_context 2022-11-23T02:58:19.8631720Z return func(*args, **kwargs) 2022-11-23T02:58:19.8632085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8632192Z p_assert( 2022-11-23T02:58:19.8632534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8632661Z traceback.print_stack() 2022-11-23T02:58:19.8633065Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8633203Z File "", line 1, in 2022-11-23T02:58:19.8633419Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8633653Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8633854Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8634012Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8634228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8634332Z self.run() 2022-11-23T02:58:19.8634544Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8634691Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8635040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8635218Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8635590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8635718Z getattr(self, test_name)() 2022-11-23T02:58:19.8636090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8636188Z fn() 2022-11-23T02:58:19.8636564Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8636686Z test(self, **param_kwargs) 2022-11-23T02:58:19.8637051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8637159Z return func(*args, **kwargs) 2022-11-23T02:58:19.8637410Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8637528Z self.run_subtests( 2022-11-23T02:58:19.8637891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8638056Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8638430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8638585Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8638970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8639072Z output = model(*input) 2022-11-23T02:58:19.8639403Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8639548Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8639930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8640109Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8640490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8640614Z _lazy_init(state, module) 2022-11-23T02:58:19.8640969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8641116Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8641440Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8641565Z return func(*args, **kwargs) 2022-11-23T02:58:19.8641956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8642059Z p_assert( 2022-11-23T02:58:19.8642399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8642529Z traceback.print_stack() 2022-11-23T02:58:19.8642783Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.8643078Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.8643478Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-11-23T02:58:19.8643614Z File "", line 1, in 2022-11-23T02:58:19.8643828Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8643971Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8644182Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8644334Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8644549Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8644685Z self.run() 2022-11-23T02:58:19.8644899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8645047Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8645402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8645536Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8645902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8646029Z getattr(self, test_name)() 2022-11-23T02:58:19.8646390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8646471Z fn() 2022-11-23T02:58:19.8646838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8646968Z test(self, **param_kwargs) 2022-11-23T02:58:19.8647325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8647449Z return func(*args, **kwargs) 2022-11-23T02:58:19.8647703Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8647820Z self.run_subtests( 2022-11-23T02:58:19.8648177Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8648322Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8648690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8648845Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8649228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8649353Z output = model(*input) 2022-11-23T02:58:19.8649681Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8649827Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8650261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8650440Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8650798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8650923Z _lazy_init(state, module) 2022-11-23T02:58:19.8651278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8651427Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8651771Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8651896Z return func(*args, **kwargs) 2022-11-23T02:58:19.8652282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8652434Z p_assert( 2022-11-23T02:58:19.8652764Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8652897Z traceback.print_stack() 2022-11-23T02:58:19.8653307Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8653440Z File "", line 1, in 2022-11-23T02:58:19.8653652Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8653794Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8653999Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8654184Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8654404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8654507Z self.run() 2022-11-23T02:58:19.8654714Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8654863Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8655216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8655350Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8655722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8655827Z getattr(self, test_name)() 2022-11-23T02:58:19.8656194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8656299Z fn() 2022-11-23T02:58:19.8656673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8656800Z test(self, **param_kwargs) 2022-11-23T02:58:19.8657167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8657294Z return func(*args, **kwargs) 2022-11-23T02:58:19.8657548Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8657643Z self.run_subtests( 2022-11-23T02:58:19.8658004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8658173Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8658540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8658698Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8659083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8659204Z output = model(*input) 2022-11-23T02:58:19.8659537Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8659661Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8660050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8660231Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8660604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8660727Z _lazy_init(state, module) 2022-11-23T02:58:19.8661085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8661232Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8661576Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8661733Z return func(*args, **kwargs) 2022-11-23T02:58:19.8662129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8662233Z p_assert( 2022-11-23T02:58:19.8662574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8662700Z traceback.print_stack() 2022-11-23T02:58:19.8662949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.8663199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.8663606Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8663799Z File "", line 1, in 2022-11-23T02:58:19.8664022Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8664169Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8664375Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8664530Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8664747Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8664851Z self.run() 2022-11-23T02:58:19.8665057Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8665186Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8665535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8665673Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8666040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8666166Z getattr(self, test_name)() 2022-11-23T02:58:19.8666537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8666636Z fn() 2022-11-23T02:58:19.8667008Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8667114Z test(self, **param_kwargs) 2022-11-23T02:58:19.8667474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8667598Z return func(*args, **kwargs) 2022-11-23T02:58:19.8667849Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8667967Z self.run_subtests( 2022-11-23T02:58:19.8668325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8668489Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8668861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8669349Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8669750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8669871Z output = model(*input) 2022-11-23T02:58:19.8670201Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8670344Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8670728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8670910Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8671282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8671461Z _lazy_init(state, module) 2022-11-23T02:58:19.8671838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8671982Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8672326Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8672453Z return func(*args, **kwargs) 2022-11-23T02:58:19.8672837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8672939Z p_assert( 2022-11-23T02:58:19.8673281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8673455Z traceback.print_stack() 2022-11-23T02:58:19.8673866Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8674000Z File "", line 1, in 2022-11-23T02:58:19.8674217Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8674362Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8674569Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8674724Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8674942Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8675029Z self.run() 2022-11-23T02:58:19.8675232Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8675379Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8675728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8675862Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8676233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8676356Z getattr(self, test_name)() 2022-11-23T02:58:19.8676703Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8676803Z fn() 2022-11-23T02:58:19.8677172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8677295Z test(self, **param_kwargs) 2022-11-23T02:58:19.8677656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8677786Z return func(*args, **kwargs) 2022-11-23T02:58:19.8678041Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8678154Z self.run_subtests( 2022-11-23T02:58:19.8678498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8678661Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8679028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8679183Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8679562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8679683Z output = model(*input) 2022-11-23T02:58:19.8680013Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8680159Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8680521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8680699Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8681123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8681250Z _lazy_init(state, module) 2022-11-23T02:58:19.8681618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8681763Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8682106Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8682235Z return func(*args, **kwargs) 2022-11-23T02:58:19.8682603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8682761Z p_assert( 2022-11-23T02:58:19.8683112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8683240Z traceback.print_stack() 2022-11-23T02:58:19.8683492Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.8683753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.8684161Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:58:19.8684292Z File "", line 1, in 2022-11-23T02:58:19.8684505Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8684634Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8684841Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8684998Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8685216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8685321Z self.run() 2022-11-23T02:58:19.8685529Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8685678Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8686007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8686140Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8686504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8686627Z getattr(self, test_name)() 2022-11-23T02:58:19.8686990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8687092Z fn() 2022-11-23T02:58:19.8687460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8687584Z test(self, **param_kwargs) 2022-11-23T02:58:19.8687929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8688056Z return func(*args, **kwargs) 2022-11-23T02:58:19.8688306Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8688420Z self.run_subtests( 2022-11-23T02:58:19.8688775Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8688937Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8689301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8689457Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8689819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8689940Z output = model(*input) 2022-11-23T02:58:19.8690322Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8690474Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8690858Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8691035Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8691411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8691533Z _lazy_init(state, module) 2022-11-23T02:58:19.8691868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8692067Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8692412Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8692541Z return func(*args, **kwargs) 2022-11-23T02:58:19.8692926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8693029Z p_assert( 2022-11-23T02:58:19.8693367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8693495Z traceback.print_stack() 2022-11-23T02:58:19.8693884Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8694015Z File "", line 1, in 2022-11-23T02:58:19.8694228Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8694374Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8694579Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8694732Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8694950Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8695054Z self.run() 2022-11-23T02:58:19.8695242Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8695392Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8695739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8695872Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8696240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8696364Z getattr(self, test_name)() 2022-11-23T02:58:19.8696732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8696813Z fn() 2022-11-23T02:58:19.8697187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8697310Z test(self, **param_kwargs) 2022-11-23T02:58:19.8697675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8697800Z return func(*args, **kwargs) 2022-11-23T02:58:19.8698052Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8698166Z self.run_subtests( 2022-11-23T02:58:19.8698522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8698667Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8699036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8699188Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8699615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8699742Z output = model(*input) 2022-11-23T02:58:19.8700075Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8700216Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8700599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8700778Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8701134Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8701305Z _lazy_init(state, module) 2022-11-23T02:58:19.8701662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8701806Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8702154Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8702280Z return func(*args, **kwargs) 2022-11-23T02:58:19.8702666Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8702771Z p_assert( 2022-11-23T02:58:19.8703094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8703220Z traceback.print_stack() 2022-11-23T02:58:19.8703471Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.8703722Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.8704129Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8704265Z File "", line 1, in 2022-11-23T02:58:19.8704479Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8704624Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8704810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8704961Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8705180Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8705284Z self.run() 2022-11-23T02:58:19.8705488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8705636Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8705990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8706105Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8706475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8706599Z getattr(self, test_name)() 2022-11-23T02:58:19.8706964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8707061Z fn() 2022-11-23T02:58:19.8707430Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8707552Z test(self, **param_kwargs) 2022-11-23T02:58:19.8707913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8708022Z return func(*args, **kwargs) 2022-11-23T02:58:19.8708275Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8708390Z self.run_subtests( 2022-11-23T02:58:19.8708797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8709192Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8709575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8709731Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8710111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8710213Z output = model(*input) 2022-11-23T02:58:19.8710543Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8710767Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8711153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8711330Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8711706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8711830Z _lazy_init(state, module) 2022-11-23T02:58:19.8712189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8712314Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8712658Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8712783Z return func(*args, **kwargs) 2022-11-23T02:58:19.8713167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8713273Z p_assert( 2022-11-23T02:58:19.8713619Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8713747Z traceback.print_stack() 2022-11-23T02:58:19.8714158Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8714270Z File "", line 1, in 2022-11-23T02:58:19.8714487Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8714631Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8714837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8714989Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8715205Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8715312Z self.run() 2022-11-23T02:58:19.8715518Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8715648Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8716000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8716134Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8716504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8716628Z getattr(self, test_name)() 2022-11-23T02:58:19.8716994Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8717092Z fn() 2022-11-23T02:58:19.8717445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8717569Z test(self, **param_kwargs) 2022-11-23T02:58:19.8717936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8718062Z return func(*args, **kwargs) 2022-11-23T02:58:19.8718371Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8718493Z self.run_subtests( 2022-11-23T02:58:19.8718855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8719021Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8719371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8719525Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8719905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8720077Z output = model(*input) 2022-11-23T02:58:19.8720410Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8720552Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8720938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8721117Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8721487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8721591Z _lazy_init(state, module) 2022-11-23T02:58:19.8721950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8722094Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8722440Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8722568Z return func(*args, **kwargs) 2022-11-23T02:58:19.8722952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8723056Z p_assert( 2022-11-23T02:58:19.8723403Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8723514Z traceback.print_stack() 2022-11-23T02:58:19.8723761Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.8724010Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.8724415Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:58:19.8724545Z File "", line 1, in 2022-11-23T02:58:19.8724762Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8724911Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8725120Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8725254Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8725471Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8725576Z self.run() 2022-11-23T02:58:19.8725782Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8725928Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8726277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8726410Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8726761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8726886Z getattr(self, test_name)() 2022-11-23T02:58:19.8727258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8727356Z fn() 2022-11-23T02:58:19.8727774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8727903Z test(self, **param_kwargs) 2022-11-23T02:58:19.8728265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8728390Z return func(*args, **kwargs) 2022-11-23T02:58:19.8728624Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8728740Z self.run_subtests( 2022-11-23T02:58:19.8729099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8729260Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8729695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8729847Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8730229Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8730351Z output = model(*input) 2022-11-23T02:58:19.8730665Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8730808Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8731193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8731368Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8731739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8731863Z _lazy_init(state, module) 2022-11-23T02:58:19.8732221Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8732366Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8732690Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8732817Z return func(*args, **kwargs) 2022-11-23T02:58:19.8733202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8733303Z p_assert( 2022-11-23T02:58:19.8733643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8733769Z traceback.print_stack() 2022-11-23T02:58:19.8734175Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8734311Z File "", line 1, in 2022-11-23T02:58:19.8734506Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8734649Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8734858Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8735011Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8735226Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8735329Z self.run() 2022-11-23T02:58:19.8735534Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8735682Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8736014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8736149Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8736522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8736647Z getattr(self, test_name)() 2022-11-23T02:58:19.8737062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8737168Z fn() 2022-11-23T02:58:19.8737542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8737647Z test(self, **param_kwargs) 2022-11-23T02:58:19.8738011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8738137Z return func(*args, **kwargs) 2022-11-23T02:58:19.8738387Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8738499Z self.run_subtests( 2022-11-23T02:58:19.8738855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8739072Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8739448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8739585Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8739968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8740088Z output = model(*input) 2022-11-23T02:58:19.8740419Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8740559Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8740949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8741127Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8741502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8741623Z _lazy_init(state, module) 2022-11-23T02:58:19.8741967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8742111Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8742453Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8742578Z return func(*args, **kwargs) 2022-11-23T02:58:19.8742960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8743062Z p_assert( 2022-11-23T02:58:19.8743404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8743535Z traceback.print_stack() 2022-11-23T02:58:19.8743770Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.8744009Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.8744416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8744549Z File "", line 1, in 2022-11-23T02:58:19.8744766Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8744908Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8745114Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8745268Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8745466Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8745574Z self.run() 2022-11-23T02:58:19.8745781Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8745929Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8746324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8746463Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8746832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8746938Z getattr(self, test_name)() 2022-11-23T02:58:19.8747302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8747400Z fn() 2022-11-23T02:58:19.8747772Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8747895Z test(self, **param_kwargs) 2022-11-23T02:58:19.8748311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8748436Z return func(*args, **kwargs) 2022-11-23T02:58:19.8748690Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8748787Z self.run_subtests( 2022-11-23T02:58:19.8749378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8749547Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8749920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8750105Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8750488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8750612Z output = model(*input) 2022-11-23T02:58:19.8750940Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8751063Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8751453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8751634Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8752004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8752127Z _lazy_init(state, module) 2022-11-23T02:58:19.8752485Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8752629Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8752972Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8753081Z return func(*args, **kwargs) 2022-11-23T02:58:19.8753466Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8753570Z p_assert( 2022-11-23T02:58:19.8753914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8754041Z traceback.print_stack() 2022-11-23T02:58:19.8754444Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8754574Z File "", line 1, in 2022-11-23T02:58:19.8754789Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8754914Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8755117Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8755271Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8755493Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8755595Z self.run() 2022-11-23T02:58:19.8755800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8756062Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8756423Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8756539Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8756908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8757032Z getattr(self, test_name)() 2022-11-23T02:58:19.8757397Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8757496Z fn() 2022-11-23T02:58:19.8757866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8758057Z test(self, **param_kwargs) 2022-11-23T02:58:19.8758403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8758533Z return func(*args, **kwargs) 2022-11-23T02:58:19.8758786Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8758899Z self.run_subtests( 2022-11-23T02:58:19.8759256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8759419Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8759785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8759940Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8760321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8760424Z output = model(*input) 2022-11-23T02:58:19.8760758Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8760899Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8761281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8761459Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8761830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8761950Z _lazy_init(state, module) 2022-11-23T02:58:19.8762308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8762438Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8762781Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8762908Z return func(*args, **kwargs) 2022-11-23T02:58:19.8763294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8763399Z p_assert( 2022-11-23T02:58:19.8763741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8763872Z traceback.print_stack() 2022-11-23T02:58:19.8764103Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.8764346Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.8764751Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:58:19.8764885Z File "", line 1, in 2022-11-23T02:58:19.8765100Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8765243Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8765497Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8765657Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8765856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8765962Z self.run() 2022-11-23T02:58:19.8766168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8766315Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8766663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8766798Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8767219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8767344Z getattr(self, test_name)() 2022-11-23T02:58:19.8767692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8767793Z fn() 2022-11-23T02:58:19.8768163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8768287Z test(self, **param_kwargs) 2022-11-23T02:58:19.8768648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8768773Z return func(*args, **kwargs) 2022-11-23T02:58:19.8769022Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8769136Z self.run_subtests( 2022-11-23T02:58:19.8769477Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8769645Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8770013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8770166Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8770544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8770665Z output = model(*input) 2022-11-23T02:58:19.8770994Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8771136Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8771500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8771679Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8772054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8772176Z _lazy_init(state, module) 2022-11-23T02:58:19.8772534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8772677Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8773021Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8773147Z return func(*args, **kwargs) 2022-11-23T02:58:19.8773510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8773616Z p_assert( 2022-11-23T02:58:19.8773957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8774087Z traceback.print_stack() 2022-11-23T02:58:19.8774492Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8774622Z File "", line 1, in 2022-11-23T02:58:19.8774883Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8775032Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8775219Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8775371Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8775586Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8775691Z self.run() 2022-11-23T02:58:19.8775896Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8776044Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8776387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8776571Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8776920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8777049Z getattr(self, test_name)() 2022-11-23T02:58:19.8777413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8777512Z fn() 2022-11-23T02:58:19.8777882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8778003Z test(self, **param_kwargs) 2022-11-23T02:58:19.8778365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8778493Z return func(*args, **kwargs) 2022-11-23T02:58:19.8778725Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8778843Z self.run_subtests( 2022-11-23T02:58:19.8779198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8779362Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8779728Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8779882Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8780262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8780384Z output = model(*input) 2022-11-23T02:58:19.8780699Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8780840Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8781225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8781405Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8781780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8781901Z _lazy_init(state, module) 2022-11-23T02:58:19.8782259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8782402Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8782728Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8782853Z return func(*args, **kwargs) 2022-11-23T02:58:19.8783236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8783341Z p_assert( 2022-11-23T02:58:19.8783682Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8783808Z traceback.print_stack() 2022-11-23T02:58:19.8784108Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.8784363Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.8784756Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8784889Z File "", line 1, in 2022-11-23T02:58:19.8785103Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8785247Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8785455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8785608Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8785872Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8785959Z self.run() 2022-11-23T02:58:19.8786168Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8786321Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8786668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8786801Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8787172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8787298Z getattr(self, test_name)() 2022-11-23T02:58:19.8787663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8787744Z fn() 2022-11-23T02:58:19.8788117Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8788245Z test(self, **param_kwargs) 2022-11-23T02:58:19.8788610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8788739Z return func(*args, **kwargs) 2022-11-23T02:58:19.8789284Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8789410Z self.run_subtests( 2022-11-23T02:58:19.8789779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8789928Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8790297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8790449Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8790834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8790955Z output = model(*input) 2022-11-23T02:58:19.8791287Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8791427Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8791810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8791970Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8792341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8792462Z _lazy_init(state, module) 2022-11-23T02:58:19.8792816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:19.8792967Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8793311Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8793435Z return func(*args, **kwargs) 2022-11-23T02:58:19.8793893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8793988Z p_assert( 2022-11-23T02:58:19.8794336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8794464Z traceback.print_stack() 2022-11-23T02:58:19.8794872Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8795004Z File "", line 1, in 2022-11-23T02:58:19.8795215Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8795361Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8795646Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8795781Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8796001Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8796109Z self.run() 2022-11-23T02:58:19.8796316Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8796463Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8796811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8796945Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8797313Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8797420Z getattr(self, test_name)() 2022-11-23T02:58:19.8797782Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8797885Z fn() 2022-11-23T02:58:19.8798257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8798383Z test(self, **param_kwargs) 2022-11-23T02:58:19.8798744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8798869Z return func(*args, **kwargs) 2022-11-23T02:58:19.8799120Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8799217Z self.run_subtests( 2022-11-23T02:58:19.8799573Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8799736Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8800106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8800263Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8800642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8800767Z output = model(*input) 2022-11-23T02:58:19.8801100Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8801224Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8801603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8801783Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8802154Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.8802276Z _lazy_init(state, module) 2022-11-23T02:58:19.8802637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8802780Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8803170Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8803283Z return func(*args, **kwargs) 2022-11-23T02:58:19.8803675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8803780Z p_assert( 2022-11-23T02:58:19.8804123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8804251Z traceback.print_stack() 2022-11-23T02:58:19.8804502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.8804747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.8805204Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-11-23T02:58:19.8805318Z File "", line 1, in 2022-11-23T02:58:19.8805537Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8805682Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8805884Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8806036Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8806251Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8806356Z self.run() 2022-11-23T02:58:19.8806545Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8806693Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8807040Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8807179Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8807546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8807672Z getattr(self, test_name)() 2022-11-23T02:58:19.8808038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8808136Z fn() 2022-11-23T02:58:19.8808488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8808614Z test(self, **param_kwargs) 2022-11-23T02:58:19.8808973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8809099Z return func(*args, **kwargs) 2022-11-23T02:58:19.8809350Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8809467Z self.run_subtests( 2022-11-23T02:58:19.8809824Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8809990Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8810340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8810495Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8810872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8810993Z output = model(*input) 2022-11-23T02:58:19.8811322Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8811464Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8811852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8812030Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8812433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8812564Z _lazy_init(state, module) 2022-11-23T02:58:19.8812926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8813070Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8813412Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8813538Z return func(*args, **kwargs) 2022-11-23T02:58:19.8813922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.8814075Z p_assert( 2022-11-23T02:58:19.8814399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.8814526Z traceback.print_stack() 2022-11-23T02:58:19.8814937Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8815070Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.8815284Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.8815428Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.8815632Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.8815784Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.8815983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.8816086Z self.run() 2022-11-23T02:58:19.8816292Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.8816445Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.8816791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.8816924Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.8817293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.8817416Z getattr(self, test_name)() 2022-11-23T02:58:19.8817764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.8817862Z fn() 2022-11-23T02:58:19.8818237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.8818361Z test(self, **param_kwargs) 2022-11-23T02:58:19.8818721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.8818849Z return func(*args, **kwargs) 2022-11-23T02:58:19.8819099Z File
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 240, in test_mixture_of_experts 2022-11-23T02:58:19.8819212Z self.run_subtests( 2022-11-23T02:58:19.8819555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.8819716Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.8820082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.8820234Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.8820611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.8820731Z output = model(*input) 2022-11-23T02:58:19.8821060Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.8821205Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.8821569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.8821794Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.8822175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.8822300Z _lazy_init(state, module) 2022-11-23T02:58:19.8822656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.8822799Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.8823147Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.8823273Z return func(*args, **kwargs) 2022-11-23T02:58:19.8823702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:19.8823806Z p_assert( 2022-11-23T02:58:19.8824151Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.8824284Z traceback.print_stack() 2022-11-23T02:58:19.8824536Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.8824778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.8825183Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8825587Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8825815Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.8826057Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.8826461Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8826866Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8827109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.8827344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.8827739Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.8828136Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:58:19.8828382Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.8828599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.8829221Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8829642Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8830412Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8830661Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.8830906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.8831372Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8831788Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8832547Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8833303Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8833611Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.8833855Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.8834238Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8834644Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8834887Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.8835128Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.8835527Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8835930Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8836176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.8836412Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.8836805Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8837202Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:58:19.8837426Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.8837660Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.8838061Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8838462Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8838703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.8838942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.8839335Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8839730Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8839969Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.8840190Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.8840631Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8841038Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8841799Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8842553Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8842854Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.8843094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.8843495Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8843893Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8844135Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:58:19.8844379Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:58:19.8844760Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.8845165Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.8845407Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:58:19.8845653Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:58:19.8846051Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 
2022-11-23T02:58:19.8846441Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.8846691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:58:19.8846931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:58:19.8847336Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.8847734Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.8847962Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:58:19.8848199Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:58:19.8848595Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.8848988Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.8849234Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:58:19.8849474Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:58:19.8849873Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.8850380Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.8851147Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8851902Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8852196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:58:19.8852421Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:58:19.8852823Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.8853218Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.8853466Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:58:19.8853705Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:58:19.8854102Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.8854862Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.8855257Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.8855997Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8856246Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:58:19.8856486Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:58:19.8856863Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.8857267Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.8857512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:58:19.8857758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:58:19.8858156Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.8858551Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 
2022-11-23T02:58:19.8858795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:58:19.8859031Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:58:19.8859477Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.8859878Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.8860104Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:58:19.8860345Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:58:19.8860741Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.8861200Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.8861963Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8862715Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.8862963Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:58:19.8863203Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:58:19.8863606Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.8864007Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.8864121Z dist init r=1, world=2 2022-11-23T02:58:19.8864442Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8864770Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8865085Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8865402Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8865723Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8866036Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8866345Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.8866675Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8866994Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8867313Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8867670Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8867967Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.8868080Z dist init r=0, world=2 2022-11-23T02:58:19.8868395Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8868709Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8869307Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8869629Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8869941Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.8870252Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8870561Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8870872Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8871183Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8871490Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8871785Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8872094Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.8872195Z ok (7.517s) 2022-11-23T02:58:19.8872563Z test_mixture_of_experts_with_delay_before_free_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 88753 2022-11-23T02:58:19.8872793Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 88754 2022-11-23T02:58:19.8873186Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8873366Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8873757Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8873952Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8874311Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8874488Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8874878Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8875072Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8875398Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.8875659Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.8876067Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.8876469Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.8876704Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.8876917Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.8878035Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8878154Z warnings.warn( 2022-11-23T02:58:19.8878400Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.8879426Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8879545Z warnings.warn( 2022-11-23T02:58:19.8879798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.8880202Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.8880603Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:58:19.8880850Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.8881090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.8881466Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8881868Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8882110Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.8882353Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.8882748Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8883139Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8883952Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8884770Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.8885020Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.8885265Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.8885668Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8886070Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8886297Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.8886593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.8886994Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8887393Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8887636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.8887880Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.8888283Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8888677Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:58:19.8888919Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.8889144Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.8889544Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8889940Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8890985Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 2022-11-23T02:58:19.8891197Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.8892219Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:1255: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T02:58:19.8892418Z _ext_post_unflatten_transform(subtensor.view(shape), param_extension) 2022-11-23T02:58:19.8892662Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.8892909Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.8893308Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8893703Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8893994Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.8894221Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.8894625Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8895025Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8895270Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.8895608Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.8896007Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8896413Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:58:19.8896655Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.8896891Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.8897271Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8897672Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8898433Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8898684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.8898920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.8899316Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8899712Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8899955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.8900196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.8900595Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 
2022-11-23T02:58:19.8901354Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8902108Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8902855Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8903288Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8904329Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py:466: UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/python_variable.cpp:318.) 
2022-11-23T02:58:19.8904616Z p.detach().reshape(-1) if isinstance(p, nn.Parameter) else p.reshape(-1) 2022-11-23T02:58:19.8904861Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.8905151Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.8905554Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8905957Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8906201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.8906439Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.8906840Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.8907237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.8907998Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8908228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.8908470Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.8908875Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 
2022-11-23T02:58:19.8909505Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8909747Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.8909989Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.8910398Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8910792Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8911040Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.8911263Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.8911664Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8912058Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8912303Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.8912539Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.8913011Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8913417Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8914174Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8914423Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.8914725Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.8915130Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8915512Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8915753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.8915990Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.8916388Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8916783Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8917030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.8917268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.8917669Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8918062Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:58:19.8918286Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.8918522Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.8918919Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8919316Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8920074Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8920320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.8920714Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8920959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.8921354Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8921471Z dist init r=0, world=2 2022-11-23T02:58:19.8921561Z dist init r=1, world=2 2022-11-23T02:58:19.8921661Z ok (29.451s) 2022-11-23T02:58:19.8922071Z test_mixture_of_experts_with_delay_before_free_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89076 2022-11-23T02:58:19.8922304Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89077 2022-11-23T02:58:19.8922688Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8922865Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8923251Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8923445Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8923866Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8924046Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8924437Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8924630Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8924876Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.8925122Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.8925526Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.8925926Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.8926165Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.8926380Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.8927417Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8927534Z warnings.warn( 2022-11-23T02:58:19.8928564Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8928680Z warnings.warn( 2022-11-23T02:58:19.8928930Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.8929176Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.8929579Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.8929979Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:58:19.8930222Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.8930467Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.8930851Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8931294Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8931542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.8931785Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.8932184Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8932580Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8933344Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8934158Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.8934406Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.8934651Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.8935047Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8935448Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8935672Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.8935921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.8936313Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8936707Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8936948Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.8937187Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.8937588Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8937984Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8938739Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8939486Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8939731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.8939958Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.8940403Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8940812Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8941057Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.8941301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.8941692Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8942086Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8942381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.8942622Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.8943009Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:58:19.8943412Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8944160Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8944898Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8945153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.8945399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.8945795Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8946193Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8946440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.8946677Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.8947084Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8947466Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:58:19.8947711Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.8947949Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.8948346Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8948741Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8949715Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8950038Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.8950323Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.8950724Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8951121Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8951875Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8952694Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8952945Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.8953165Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.8953566Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8953962Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.8954712Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8954960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.8955200Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.8955597Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.8955995Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:58:19.8956235Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.8956476Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.8956851Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8957248Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.8957493Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.8957733Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.8958132Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8958529Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.8959331Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8959587Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.8959829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.8960230Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:58:19.8960607Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.8960853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.8961140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.8961537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8961939Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.8962690Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8962936Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.8963174Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.8963577Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.8963977Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:58:19.8964201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.8964441Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.8964840Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8965234Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.8965987Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8966241Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.8966481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.8966879Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8967277Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.8967519Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.8967758Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.8968144Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 
2022-11-23T02:58:19.8968586Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.8968835Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.8969071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.8969467Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8969864Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.8969978Z dist init r=0, world=2 2022-11-23T02:58:19.8970087Z dist init r=1, world=2 2022-11-23T02:58:19.8970218Z ok (25.244s) 2022-11-23T02:58:19.8970593Z test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89399 2022-11-23T02:58:19.8970820Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89400 2022-11-23T02:58:19.8971202Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8971382Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8971770Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8971965Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8972340Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.8972519Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.8972884Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.8973080Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.8973325Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.8973572Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.8973973Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.8974373Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.8974606Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.8974843Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.8975878Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8975995Z warnings.warn( 2022-11-23T02:58:19.8976220Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.8977243Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.8977359Z warnings.warn( 2022-11-23T02:58:19.8977646Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.8978059Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.8978459Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:58:19.8978708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.8979106Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8979399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.8979798Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.8980030Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.8980268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.8980660Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8981058Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.8981816Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8982563Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.8982806Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.8983052Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.8983449Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8983847Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.8984094Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.8984342Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.8984724Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8985118Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.8985358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.8985598Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.8985989Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8986385Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.8987191Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8987950Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8988194Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.8988437Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.8988905Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8989507Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.8989753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.8989997Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.8990388Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8990781Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.8991027Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.8991268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.8991675Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:58:19.8992074Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.8992824Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8993577Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8993814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.8994061Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.8994463Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8994862Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.8995109Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.8995351Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.8995756Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.8996158Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:58:19.8996473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.8996702Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.8997107Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8997501Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.8998255Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.8998563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.8998803Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.8999205Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.8999605Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.9000354Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9001104Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9001348Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.9001583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.9001963Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.9002361Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.9003106Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9003359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.9003757Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.9004001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.9004401Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:58:19.9004647Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.9004886Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.9005293Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9005687Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9005960Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.9006207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.9006607Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9007003Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9007762Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9008067Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.9008310Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.9008705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:58:19.9009105Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.9009349Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.9009570Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.9009976Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.9010374Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.9011133Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9011378Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.9011616Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.9012015Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.9012416Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:58:19.9012663Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.9012902Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.9013281Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.9013674Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.9014421Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9014670Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.9014951Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.9015355Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.9015748Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.9015993Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.9016229Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.9016633Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 
2022-11-23T02:58:19.9017062Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.9017307Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.9017547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.9017942Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.9018336Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.9018450Z dist init r=0, world=2 2022-11-23T02:58:19.9018558Z dist init r=1, world=2 2022-11-23T02:58:19.9018659Z ok (29.653s) 2022-11-23T02:58:19.9019026Z test_mixture_of_experts_with_delay_before_free_offload_true_no_shard (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 89722 2022-11-23T02:58:19.9019236Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 89723 2022-11-23T02:58:19.9019620Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9019799Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9020187Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9020382Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9020755Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9020929Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9021308Z 
/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9021483Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9021731Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.9021980Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.9022382Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9022779Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9023014Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.9023247Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.9024324Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9024449Z warnings.warn( 2022-11-23T02:58:19.9024696Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.9025715Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9025888Z warnings.warn( 2022-11-23T02:58:19.9026116Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.9026521Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.9026653Z File "", line 1, in 2022-11-23T02:58:19.9026871Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9027016Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9027223Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9027376Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9027592Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9027679Z self.run() 2022-11-23T02:58:19.9027890Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9028037Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9028385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9028522Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9028893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9029218Z getattr(self, test_name)() 2022-11-23T02:58:19.9029582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9029686Z fn() 2022-11-23T02:58:19.9030059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, 
in instantiated_test 2022-11-23T02:58:19.9030185Z test(self, **param_kwargs) 2022-11-23T02:58:19.9030546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9030676Z return func(*args, **kwargs) 2022-11-23T02:58:19.9030960Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9031077Z self.run_subtests( 2022-11-23T02:58:19.9031421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9031585Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9031957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9032111Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9032490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9032610Z output = model(*input) 2022-11-23T02:58:19.9032945Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9033088Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9033523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9033712Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9034088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9034211Z _lazy_init(state, module) 2022-11-23T02:58:19.9034573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9034718Z handle.init_flat_param_attributes() 
2022-11-23T02:58:19.9035060Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9035244Z return func(*args, **kwargs) 2022-11-23T02:58:19.9035633Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9035718Z p_assert( 2022-11-23T02:58:19.9036065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9036193Z traceback.print_stack() 2022-11-23T02:58:19.9036602Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.9036732Z File "", line 1, in 2022-11-23T02:58:19.9036943Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9037085Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9037273Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9037426Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9037646Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9037750Z self.run() 2022-11-23T02:58:19.9037953Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9038104Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9038450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9038584Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9038935Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9039059Z getattr(self, test_name)() 2022-11-23T02:58:19.9039427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T02:58:19.9039526Z fn() 2022-11-23T02:58:19.9039899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9040026Z test(self, **param_kwargs) 2022-11-23T02:58:19.9040388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9040521Z return func(*args, **kwargs) 2022-11-23T02:58:19.9040785Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9040901Z self.run_subtests( 2022-11-23T02:58:19.9041259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9041423Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9041793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9041947Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9042332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9042453Z output = model(*input) 2022-11-23T02:58:19.9042858Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9043008Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9043394Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9043575Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9043945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9044068Z _lazy_init(state, module) 2022-11-23T02:58:19.9044424Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9044620Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9044947Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9045075Z return func(*args, **kwargs) 2022-11-23T02:58:19.9045464Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9045568Z p_assert( 2022-11-23T02:58:19.9045909Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9046035Z traceback.print_stack() 2022-11-23T02:58:19.9046283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.9046535Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.9046922Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:58:19.9047058Z File "", line 1, in 2022-11-23T02:58:19.9047272Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9047416Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9047625Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9047778Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9047994Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9048099Z self.run() 2022-11-23T02:58:19.9048288Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9048436Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9048784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9048918Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9049290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9049415Z getattr(self, test_name)() 2022-11-23T02:58:19.9049782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9049880Z fn() 2022-11-23T02:58:19.9050274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9050400Z test(self, **param_kwargs) 2022-11-23T02:58:19.9050762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9050888Z return func(*args, **kwargs) 2022-11-23T02:58:19.9051172Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9051287Z self.run_subtests( 2022-11-23T02:58:19.9051649Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9051813Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9052225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9052388Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9052775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9052896Z output = model(*input) 2022-11-23T02:58:19.9053229Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9053372Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9053756Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9053998Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9054358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9054480Z _lazy_init(state, module) 2022-11-23T02:58:19.9054840Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9054989Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9055335Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9055460Z return func(*args, **kwargs) 2022-11-23T02:58:19.9055841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9055926Z p_assert( 2022-11-23T02:58:19.9056271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9056403Z traceback.print_stack() 2022-11-23T02:58:19.9056811Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.9056944Z File "", line 1, in 2022-11-23T02:58:19.9057163Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9057307Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9057513Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9057650Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9057868Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9057974Z self.run() 2022-11-23T02:58:19.9058180Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9058324Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9058674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9058807Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9059152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9059281Z getattr(self, test_name)() 2022-11-23T02:58:19.9059643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9059742Z fn() 2022-11-23T02:58:19.9060110Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9060233Z test(self, **param_kwargs) 2022-11-23T02:58:19.9060593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9060719Z return func(*args, **kwargs) 2022-11-23T02:58:19.9060982Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9061101Z self.run_subtests( 2022-11-23T02:58:19.9061460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9061671Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9062051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9062205Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9062587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9062707Z output = model(*input) 2022-11-23T02:58:19.9063020Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9063165Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9063598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9063776Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9064152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9064275Z _lazy_init(state, module) 2022-11-23T02:58:19.9064632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9064779Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9065103Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9065228Z return func(*args, **kwargs) 2022-11-23T02:58:19.9065614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9065720Z p_assert( 2022-11-23T02:58:19.9066061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9066188Z traceback.print_stack() 2022-11-23T02:58:19.9066442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.9066691Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.9067097Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.9067212Z File "", line 1, in 2022-11-23T02:58:19.9067426Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9067569Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9067777Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9067933Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9068150Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9068256Z self.run() 2022-11-23T02:58:19.9068445Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9068595Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9069103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9069248Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9069622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9069748Z getattr(self, test_name)() 2022-11-23T02:58:19.9070113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9070213Z fn() 2022-11-23T02:58:19.9070570Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9070693Z test(self, **param_kwargs) 2022-11-23T02:58:19.9071125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9071261Z return func(*args, **kwargs) 2022-11-23T02:58:19.9071545Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9071660Z self.run_subtests( 2022-11-23T02:58:19.9072020Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9072185Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9072536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9072753Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9073137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9073259Z output = model(*input) 2022-11-23T02:58:19.9073593Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9073736Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9074118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9074295Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9074649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9074772Z _lazy_init(state, module) 2022-11-23T02:58:19.9075127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9075272Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9075614Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9075740Z return func(*args, **kwargs) 2022-11-23T02:58:19.9076126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9076230Z p_assert( 2022-11-23T02:58:19.9076555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9076683Z traceback.print_stack() 2022-11-23T02:58:19.9077089Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.9077218Z File "", line 1, in 2022-11-23T02:58:19.9077433Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9077581Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9077789Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9077943Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9078146Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9078252Z self.run() 2022-11-23T02:58:19.9078458Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9078606Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9078953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9079086Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9079453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9079575Z getattr(self, test_name)() 2022-11-23T02:58:19.9079929Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9080028Z fn() 2022-11-23T02:58:19.9080443Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9080574Z test(self, **param_kwargs) 2022-11-23T02:58:19.9080942Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9081069Z return func(*args, **kwargs) 2022-11-23T02:58:19.9081353Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9081448Z self.run_subtests( 2022-11-23T02:58:19.9081808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9081972Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9082393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9082548Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9082934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9083057Z output = model(*input) 2022-11-23T02:58:19.9083387Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9083511Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9083894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9084073Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9084447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9084573Z _lazy_init(state, module) 2022-11-23T02:58:19.9084931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9085076Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9085422Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9085548Z return func(*args, **kwargs) 2022-11-23T02:58:19.9085919Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9086022Z p_assert( 2022-11-23T02:58:19.9086365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9086492Z traceback.print_stack() 2022-11-23T02:58:19.9086741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.9086994Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.9087405Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-11-23T02:58:19.9087539Z File "", line 1, in 2022-11-23T02:58:19.9087736Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9087878Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9088084Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9088237Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9088454Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9088558Z self.run() 2022-11-23T02:58:19.9088765Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9088898Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9089248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9089386Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9089803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9089935Z getattr(self, test_name)() 2022-11-23T02:58:19.9090306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9090408Z fn() 2022-11-23T02:58:19.9090781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9090885Z test(self, **param_kwargs) 2022-11-23T02:58:19.9091251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9091378Z return func(*args, **kwargs) 2022-11-23T02:58:19.9091709Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9091824Z self.run_subtests( 2022-11-23T02:58:19.9092190Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9092355Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9092723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9092859Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9093239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9093360Z output = model(*input) 2022-11-23T02:58:19.9093691Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9093837Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9094213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9094392Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9094766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9094871Z _lazy_init(state, module) 2022-11-23T02:58:19.9095230Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9095373Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9095711Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9095839Z return func(*args, **kwargs) 2022-11-23T02:58:19.9096223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9096330Z p_assert( 2022-11-23T02:58:19.9096671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9096780Z traceback.print_stack() 2022-11-23T02:58:19.9097190Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.9097321Z File "", line 1, in 2022-11-23T02:58:19.9097536Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9097679Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9097884Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9098037Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9098255Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9098344Z self.run() 2022-11-23T02:58:19.9098549Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9098696Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9099086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9099226Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9099597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9099722Z getattr(self, test_name)() 2022-11-23T02:58:19.9100086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9100166Z fn() 2022-11-23T02:58:19.9100534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9100657Z test(self, **param_kwargs) 2022-11-23T02:58:19.9101074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9101200Z return func(*args, **kwargs) 2022-11-23T02:58:19.9101487Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9101604Z self.run_subtests( 2022-11-23T02:58:19.9101963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9102110Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9102476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9102629Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9103007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9103133Z output = model(*input) 2022-11-23T02:58:19.9103465Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9103608Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9103995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9104156Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9104529Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9104648Z _lazy_init(state, module) 2022-11-23T02:58:19.9105005Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9105148Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9105488Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9105617Z return func(*args, **kwargs) 2022-11-23T02:58:19.9106001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9106086Z p_assert( 2022-11-23T02:58:19.9106429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9106555Z traceback.print_stack() 2022-11-23T02:58:19.9106802Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.9107049Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.9107454Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.9107585Z File "", line 1, in 2022-11-23T02:58:19.9107803Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9107929Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9108134Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9108334Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9108557Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9108662Z self.run() 2022-11-23T02:58:19.9108867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9109187Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9109532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9109675Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9110047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9110244Z getattr(self, test_name)() 2022-11-23T02:58:19.9110613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9110712Z fn() 2022-11-23T02:58:19.9111087Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9111212Z test(self, **param_kwargs) 2022-11-23T02:58:19.9111556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9111683Z return func(*args, **kwargs) 2022-11-23T02:58:19.9111967Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9112082Z self.run_subtests( 2022-11-23T02:58:19.9112442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9112612Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9112985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9113139Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9113506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9113627Z output = model(*input) 2022-11-23T02:58:19.9113957Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9114100Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9114483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9114663Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9115034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9115160Z _lazy_init(state, module) 2022-11-23T02:58:19.9115517Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9115646Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9115989Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9116113Z return func(*args, **kwargs) 2022-11-23T02:58:19.9116496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9116599Z p_assert( 2022-11-23T02:58:19.9116939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9117066Z traceback.print_stack() 2022-11-23T02:58:19.9117454Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.9117591Z File "", line 1, in 2022-11-23T02:58:19.9117807Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9118009Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9118226Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9118379Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9118598Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9118703Z self.run() 2022-11-23T02:58:19.9118891Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9119040Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9119387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9119583Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9119953Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9120077Z getattr(self, test_name)() 2022-11-23T02:58:19.9120446Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9120545Z fn() 2022-11-23T02:58:19.9120897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9121022Z test(self, **param_kwargs) 2022-11-23T02:58:19.9121384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9121511Z return func(*args, **kwargs) 2022-11-23T02:58:19.9121794Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9121913Z self.run_subtests( 2022-11-23T02:58:19.9122272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9122436Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9122789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9122943Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9123322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9123444Z output = model(*input) 2022-11-23T02:58:19.9123775Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9123919Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9124300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9124478Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9124834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9124958Z _lazy_init(state, module) 2022-11-23T02:58:19.9125313Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9125457Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9125801Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9125927Z return func(*args, **kwargs) 2022-11-23T02:58:19.9126312Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9126415Z p_assert( 2022-11-23T02:58:19.9126735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9126868Z traceback.print_stack() 2022-11-23T02:58:19.9127118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.9127417Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.9127834Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:58:19.9127966Z File "", line 1, in 2022-11-23T02:58:19.9128181Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9128325Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9128515Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9128668Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9128882Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9129035Z self.run() 2022-11-23T02:58:19.9129241Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9129391Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9129741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9129877Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9130228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9130355Z getattr(self, test_name)() 2022-11-23T02:58:19.9130717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9130816Z fn() 2022-11-23T02:58:19.9131186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9131314Z test(self, **param_kwargs) 2022-11-23T02:58:19.9131674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9131781Z return func(*args, **kwargs) 2022-11-23T02:58:19.9132067Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9132180Z self.run_subtests( 2022-11-23T02:58:19.9132539Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9132702Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9133073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9133229Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9133611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9133735Z output = model(*input) 2022-11-23T02:58:19.9134048Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9134194Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9134581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9134760Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9135132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9135254Z _lazy_init(state, module) 2022-11-23T02:58:19.9135612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9135756Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9136084Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9136208Z return func(*args, **kwargs) 2022-11-23T02:58:19.9136635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9136743Z p_assert( 2022-11-23T02:58:19.9137088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9137216Z traceback.print_stack() 2022-11-23T02:58:19.9137619Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.9137750Z File "", line 1, in 2022-11-23T02:58:19.9137950Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9138092Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9138387Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9138542Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9138760Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9138864Z self.run() 2022-11-23T02:58:19.9139074Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9139205Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9139552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9139687Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9140054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9140180Z getattr(self, test_name)() 2022-11-23T02:58:19.9140544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9140648Z fn() 2022-11-23T02:58:19.9141017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9141123Z test(self, **param_kwargs) 2022-11-23T02:58:19.9141486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9141614Z return func(*args, **kwargs) 2022-11-23T02:58:19.9141896Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9142009Z self.run_subtests( 2022-11-23T02:58:19.9142370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9142537Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9142907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9143045Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9143426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9143548Z output = model(*input) 2022-11-23T02:58:19.9143881Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9144024Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9144407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9144585Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9144958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9145062Z _lazy_init(state, module) 2022-11-23T02:58:19.9145424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9145567Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9145907Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9146078Z return func(*args, **kwargs) 2022-11-23T02:58:19.9146474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9146578Z p_assert( 2022-11-23T02:58:19.9146920Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9147028Z traceback.print_stack() 2022-11-23T02:58:19.9147276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.9147526Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.9147986Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.9148118Z File "", line 1, in 2022-11-23T02:58:19.9148340Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9148486Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9148693Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9148829Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9149215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9149326Z self.run() 2022-11-23T02:58:19.9149540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9149689Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9150037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9150208Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9150581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9150688Z getattr(self, test_name)() 2022-11-23T02:58:19.9151057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9151156Z fn() 2022-11-23T02:58:19.9151530Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9151654Z test(self, **param_kwargs) 2022-11-23T02:58:19.9152017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9152143Z return func(*args, **kwargs) 2022-11-23T02:58:19.9152425Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9152525Z self.run_subtests( 2022-11-23T02:58:19.9152883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9153051Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9153420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9153575Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9153959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9154080Z output = model(*input) 2022-11-23T02:58:19.9154412Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9154537Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9154925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9155105Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9155550Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9155685Z _lazy_init(state, module) 2022-11-23T02:58:19.9156048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9156193Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9156533Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9156641Z return func(*args, **kwargs) 2022-11-23T02:58:19.9157029Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9157193Z p_assert( 2022-11-23T02:58:19.9157541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9157671Z traceback.print_stack() 2022-11-23T02:58:19.9158080Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.9158212Z File "", line 1, in 2022-11-23T02:58:19.9158429Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9158556Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9158763Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9158916Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9159133Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9159240Z self.run() 2022-11-23T02:58:19.9159443Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9159595Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9159922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9160056Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9160427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9160551Z getattr(self, test_name)() 2022-11-23T02:58:19.9160917Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9161016Z fn() 2022-11-23T02:58:19.9161386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9161512Z test(self, **param_kwargs) 2022-11-23T02:58:19.9161856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9161985Z return func(*args, **kwargs) 2022-11-23T02:58:19.9162264Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9162379Z self.run_subtests( 2022-11-23T02:58:19.9162743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9162906Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9163275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9163428Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9163791Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9163914Z output = model(*input) 2022-11-23T02:58:19.9164241Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9164387Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9164769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9164996Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9165380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9165503Z _lazy_init(state, module) 2022-11-23T02:58:19.9165843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9165989Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9166332Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9166459Z return func(*args, **kwargs) 2022-11-23T02:58:19.9166893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9166997Z p_assert( 2022-11-23T02:58:19.9167339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9167467Z traceback.print_stack() 2022-11-23T02:58:19.9167698Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.9167947Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.9168357Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:58:19.9168487Z File "", line 1, in 2022-11-23T02:58:19.9168703Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9168849Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9169060Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9169213Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9169416Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9169522Z self.run() 2022-11-23T02:58:19.9169729Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9169876Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9170220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9170357Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9170730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9170853Z getattr(self, test_name)() 2022-11-23T02:58:19.9171203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9171306Z fn() 2022-11-23T02:58:19.9171678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9171806Z test(self, **param_kwargs) 2022-11-23T02:58:19.9172170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9172298Z return func(*args, **kwargs) 2022-11-23T02:58:19.9172580Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9172698Z self.run_subtests( 2022-11-23T02:58:19.9173035Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9173199Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9173569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9173723Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9174153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9174281Z output = model(*input) 2022-11-23T02:58:19.9174617Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9174761Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9175123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9175302Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9175675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9175842Z _lazy_init(state, module) 2022-11-23T02:58:19.9176199Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9176345Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9176690Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9176816Z return func(*args, **kwargs) 2022-11-23T02:58:19.9177185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9177286Z p_assert( 2022-11-23T02:58:19.9177628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9177759Z traceback.print_stack() 2022-11-23T02:58:19.9178165Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.9178300Z File "", line 1, in 2022-11-23T02:58:19.9178515Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9178659Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9178851Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9179003Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9179220Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9179325Z self.run() 2022-11-23T02:58:19.9179532Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9179682Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9180034Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9180150Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9180518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9180648Z getattr(self, test_name)() 2022-11-23T02:58:19.9181011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9181108Z fn() 2022-11-23T02:58:19.9181478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9181602Z test(self, **param_kwargs) 2022-11-23T02:58:19.9181960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9182069Z return func(*args, **kwargs) 2022-11-23T02:58:19.9182354Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9182468Z self.run_subtests( 2022-11-23T02:58:19.9182821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9182990Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9183360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9183561Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9184004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9184108Z output = model(*input) 2022-11-23T02:58:19.9184442Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9184587Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9184971Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9185148Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9185635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9185757Z _lazy_init(state, module) 2022-11-23T02:58:19.9186116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9186260Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9186585Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9186712Z return func(*args, **kwargs) 2022-11-23T02:58:19.9187097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9187200Z p_assert( 2022-11-23T02:58:19.9187540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9187666Z traceback.print_stack() 2022-11-23T02:58:19.9187921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.9188148Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.9188556Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.9188689Z File "", line 1, in 2022-11-23T02:58:19.9188904Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9189262Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9189476Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9189629Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9189849Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9189936Z self.run() 2022-11-23T02:58:19.9190147Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9190295Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9190646Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9190783Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9191153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9191278Z getattr(self, test_name)() 2022-11-23T02:58:19.9191644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9191724Z fn() 
2022-11-23T02:58:19.9192094Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9192217Z test(self, **param_kwargs) 2022-11-23T02:58:19.9192576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9192706Z return func(*args, **kwargs) 2022-11-23T02:58:19.9193056Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9193180Z self.run_subtests( 2022-11-23T02:58:19.9193546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9193691Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9194059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9194214Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9194596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9194775Z output = model(*input) 2022-11-23T02:58:19.9195110Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9195254Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9195634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9195795Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9196168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9196291Z _lazy_init(state, module) 2022-11-23T02:58:19.9196645Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9196789Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9197130Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9197260Z return func(*args, **kwargs) 2022-11-23T02:58:19.9197647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9197731Z p_assert( 2022-11-23T02:58:19.9198075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9198203Z traceback.print_stack() 2022-11-23T02:58:19.9198611Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.9198743Z File "", line 1, in 2022-11-23T02:58:19.9198961Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9199104Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9199310Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9199447Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9199668Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9199774Z self.run() 2022-11-23T02:58:19.9199980Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9200128Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9200474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9200608Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9200976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:58:19.9201083Z getattr(self, test_name)() 2022-11-23T02:58:19.9201448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9201545Z fn() 2022-11-23T02:58:19.9201917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9202047Z test(self, **param_kwargs) 2022-11-23T02:58:19.9202408Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9202590Z return func(*args, **kwargs) 2022-11-23T02:58:19.9202884Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9202980Z self.run_subtests( 2022-11-23T02:58:19.9203343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9203512Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9203881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9204035Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9204469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9204590Z output = model(*input) 2022-11-23T02:58:19.9204922Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9205046Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9205429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9205607Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9205979Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9206100Z _lazy_init(state, module) 2022-11-23T02:58:19.9206454Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9206602Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9206943Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9207052Z return func(*args, **kwargs) 2022-11-23T02:58:19.9207439Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9207542Z p_assert( 2022-11-23T02:58:19.9207886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9208010Z traceback.print_stack() 2022-11-23T02:58:19.9208261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.9208505Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.9208909Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:58:19.9209026Z File "", line 1, in 2022-11-23T02:58:19.9209241Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9209386Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9209593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9209746Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9209961Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9210066Z self.run() 2022-11-23T02:58:19.9210254Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9210404Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9210749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9210880Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9211253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9211377Z getattr(self, test_name)() 2022-11-23T02:58:19.9211802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9211909Z fn() 2022-11-23T02:58:19.9212266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9212390Z test(self, **param_kwargs) 2022-11-23T02:58:19.9212747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9212872Z return func(*args, **kwargs) 2022-11-23T02:58:19.9213157Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9213273Z self.run_subtests( 2022-11-23T02:58:19.9213682Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9213847Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9214202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9214358Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9214740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9214860Z output = model(*input) 2022-11-23T02:58:19.9215193Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9215338Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9215720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9215902Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9216259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9216381Z _lazy_init(state, module) 2022-11-23T02:58:19.9216744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9216889Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9217233Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9217358Z return func(*args, **kwargs) 2022-11-23T02:58:19.9217746Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9217850Z p_assert( 2022-11-23T02:58:19.9218171Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9218305Z traceback.print_stack() 2022-11-23T02:58:19.9218711Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.9218844Z File "", line 1, in 2022-11-23T02:58:19.9219061Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9219205Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9219412Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9219567Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9219767Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9219872Z self.run() 2022-11-23T02:58:19.9220079Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9220227Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9220580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9220716Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9221134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9221267Z getattr(self, test_name)() 2022-11-23T02:58:19.9221615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9221717Z fn() 2022-11-23T02:58:19.9222087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9222213Z test(self, **param_kwargs) 2022-11-23T02:58:19.9222576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9222703Z return func(*args, **kwargs) 2022-11-23T02:58:19.9223036Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9223152Z self.run_subtests( 2022-11-23T02:58:19.9223496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9223662Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9224033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9224188Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9224568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9224689Z output = model(*input) 2022-11-23T02:58:19.9225021Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9225168Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9225532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9225711Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9226089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9226212Z _lazy_init(state, module) 2022-11-23T02:58:19.9226566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9226712Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9227055Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9227181Z return func(*args, **kwargs) 2022-11-23T02:58:19.9227544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9227651Z p_assert( 2022-11-23T02:58:19.9227995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9228123Z traceback.print_stack() 2022-11-23T02:58:19.9228376Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.9228617Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.9229237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.9229379Z File "", line 1, in 2022-11-23T02:58:19.9229578Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9229724Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9229933Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9230092Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9230310Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9230415Z self.run() 2022-11-23T02:58:19.9230694Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9230832Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9231187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9231322Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9231691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9231816Z getattr(self, test_name)() 2022-11-23T02:58:19.9232181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9232339Z fn() 
2022-11-23T02:58:19.9232715Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9232822Z test(self, **param_kwargs) 2022-11-23T02:58:19.9233188Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9233314Z return func(*args, **kwargs) 2022-11-23T02:58:19.9233597Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9233711Z self.run_subtests( 2022-11-23T02:58:19.9234067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9234230Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9234598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9234737Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9235122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9235242Z output = model(*input) 2022-11-23T02:58:19.9235576Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9235720Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9236101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9236280Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9236650Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9236773Z _lazy_init(state, module) 2022-11-23T02:58:19.9237115Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9237263Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9237603Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9237730Z return func(*args, **kwargs) 2022-11-23T02:58:19.9238112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9238215Z p_assert( 2022-11-23T02:58:19.9238556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9238666Z traceback.print_stack() 2022-11-23T02:58:19.9239072Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.9239203Z File "", line 1, in 2022-11-23T02:58:19.9239416Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9239563Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9239770Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9239922Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9240185Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9240278Z self.run() 2022-11-23T02:58:19.9240489Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9240638Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9240985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9241118Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9241488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:58:19.9241662Z getattr(self, test_name)() 2022-11-23T02:58:19.9242028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9242109Z fn() 2022-11-23T02:58:19.9242486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9242610Z test(self, **param_kwargs) 2022-11-23T02:58:19.9242974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9243098Z return func(*args, **kwargs) 2022-11-23T02:58:19.9243383Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9243497Z self.run_subtests( 2022-11-23T02:58:19.9243852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9244001Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9244370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9244525Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9244910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9245031Z output = model(*input) 2022-11-23T02:58:19.9245363Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9245507Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9245891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9246051Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9246425Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9246549Z _lazy_init(state, module) 2022-11-23T02:58:19.9246905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9247052Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9247396Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9247521Z return func(*args, **kwargs) 2022-11-23T02:58:19.9247907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9247994Z p_assert( 2022-11-23T02:58:19.9248335Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9248462Z traceback.print_stack() 2022-11-23T02:58:19.9248715Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.9248961Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.9249369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-11-23T02:58:19.9249546Z File "", line 1, in 2022-11-23T02:58:19.9249772Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9249899Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9250109Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9250306Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9250527Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9250633Z self.run() 2022-11-23T02:58:19.9250841Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9251052Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9251402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9251519Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9251889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9252014Z getattr(self, test_name)() 2022-11-23T02:58:19.9252381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9252478Z fn() 2022-11-23T02:58:19.9252849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9252974Z test(self, **param_kwargs) 2022-11-23T02:58:19.9253334Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9253445Z return func(*args, **kwargs) 2022-11-23T02:58:19.9253729Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9253844Z self.run_subtests( 2022-11-23T02:58:19.9254209Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9254372Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9254741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9254896Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9255276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9255380Z output = model(*input) 2022-11-23T02:58:19.9255709Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9255856Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9256240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9256422Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9256794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9256914Z _lazy_init(state, module) 2022-11-23T02:58:19.9257273Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9257399Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9257742Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9257868Z return func(*args, **kwargs) 2022-11-23T02:58:19.9258259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9258363Z p_assert( 2022-11-23T02:58:19.9258703Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9258879Z traceback.print_stack() 2022-11-23T02:58:19.9259298Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.9259414Z File "", line 1, in 2022-11-23T02:58:19.9259626Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9259770Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9259978Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9260132Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9260349Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9260504Z self.run() 2022-11-23T02:58:19.9260692Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9260841Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9261189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9261323Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9261696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9261821Z getattr(self, test_name)() 2022-11-23T02:58:19.9262185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9262283Z fn() 2022-11-23T02:58:19.9262633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9262760Z test(self, **param_kwargs) 2022-11-23T02:58:19.9263122Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9263248Z return func(*args, **kwargs) 2022-11-23T02:58:19.9263534Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9263650Z self.run_subtests( 2022-11-23T02:58:19.9264009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9264172Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9264525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9264680Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9265063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9265189Z output = model(*input) 2022-11-23T02:58:19.9265517Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9265659Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9266046Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9266223Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9266577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9266700Z _lazy_init(state, module) 2022-11-23T02:58:19.9267057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9267202Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9267547Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9267676Z return func(*args, **kwargs) 2022-11-23T02:58:19.9268058Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9268209Z p_assert( 2022-11-23T02:58:19.9268541Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9268671Z traceback.print_stack() 2022-11-23T02:58:19.9268921Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.9269381Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.9269794Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.9270200Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.9270530Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.9270775Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.9271181Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.9271564Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.9271808Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.9272044Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.9272441Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.9272848Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:58:19.9273092Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.9273327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.9273726Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9274122Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9274883Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9275113Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.9275355Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.9275753Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9276152Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9276392Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.9276630Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.9277025Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 
2022-11-23T02:58:19.9277429Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.9277727Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.9277969Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.9278348Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.9278746Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.9278987Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.9279223Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.9279671Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.9280070Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.9280315Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.9280551Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.9280943Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.9281322Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 
2022-11-23T02:58:19.9281561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.9281798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.9282192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.9282591Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.9282830Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.9283068Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.9283460Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.9283856Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.9284082Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.9284317Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.9284713Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.9285108Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 
2022-11-23T02:58:19.9285344Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:58:19.9285580Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:58:19.9285975Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.9286371Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.9286615Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:58:19.9286895Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:58:19.9287277Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.9287673Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.9287912Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:58:19.9288149Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:58:19.9288543Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.9288999Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 
2022-11-23T02:58:19.9289239Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:58:19.9289488Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:58:19.9289886Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.9290261Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.9290512Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:58:19.9290753Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:58:19.9291155Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.9291548Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.9291794Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:58:19.9292033Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:58:19.9292431Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.9292823Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 
2022-11-23T02:58:19.9293070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:58:19.9293294Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:58:19.9293692Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.9294086Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.9294327Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:58:19.9294564Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:58:19.9294961Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.9295354Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.9295596Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:58:19.9295838Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:58:19.9296264Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.9296670Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 
2022-11-23T02:58:19.9296911Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:58:19.9297149Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:58:19.9297546Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.9297938Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.9298228Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:58:19.9298473Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:58:19.9298872Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.9299246Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.9299491Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:58:19.9299728Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:58:19.9300121Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.9300517Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.9300631Z dist init r=1, world=2 2022-11-23T02:58:19.9300972Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.9301300Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9301619Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9301933Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9302227Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9302539Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9302847Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9303155Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9303486Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9303810Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9304130Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.9304492Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9304613Z dist init r=0, world=2 2022-11-23T02:58:19.9304930Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9305239Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9305547Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9305885Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9306198Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9306505Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9306811Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9307118Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9307428Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.9307736Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9308048Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9308354Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9308457Z ok (23.942s) 2022-11-23T02:58:19.9308819Z test_mixture_of_experts_with_delay_before_free_offload_true_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90069 2022-11-23T02:58:19.9309236Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90070 2022-11-23T02:58:19.9309636Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9309821Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9310211Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9310405Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9310774Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9310951Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9311337Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9311516Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9311765Z INFO:torch.distributed.distributed_c10d:Added key: 
store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.9312080Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.9312501Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9312904Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9313137Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.9313371Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.9314410Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9314600Z warnings.warn( 2022-11-23T02:58:19.9314850Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.9315874Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.9315990Z warnings.warn( 2022-11-23T02:58:19.9316216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.9316621Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.9316753Z File "", line 1, in 2022-11-23T02:58:19.9316970Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9317113Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9317321Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9317473Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9317689Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9317776Z self.run() 2022-11-23T02:58:19.9317982Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9318134Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9318486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9318623Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9318995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9319120Z getattr(self, test_name)() 2022-11-23T02:58:19.9319469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9319566Z fn() 2022-11-23T02:58:19.9319938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9320063Z test(self, **param_kwargs) 2022-11-23T02:58:19.9320424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 
2022-11-23T02:58:19.9320556Z return func(*args, **kwargs) 2022-11-23T02:58:19.9320842Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9320954Z self.run_subtests( 2022-11-23T02:58:19.9321342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9321514Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9321885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9322040Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9322418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9322537Z output = model(*input) 2022-11-23T02:58:19.9322868Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9323058Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9323442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9323608Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9323982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9324104Z _lazy_init(state, module) 2022-11-23T02:58:19.9324461Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9324605Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9324949Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9325076Z return func(*args, **kwargs) 2022-11-23T02:58:19.9325467Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9325552Z p_assert( 2022-11-23T02:58:19.9325895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9326027Z traceback.print_stack() 2022-11-23T02:58:19.9326436Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.9326567Z File "", line 1, in 2022-11-23T02:58:19.9326782Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9326928Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9327116Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9327268Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9327483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9327592Z self.run() 2022-11-23T02:58:19.9327798Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9327946Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9328298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9328433Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9328784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9328909Z getattr(self, test_name)() 2022-11-23T02:58:19.9329275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9329374Z fn() 2022-11-23T02:58:19.9329746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9329874Z test(self, 
**param_kwargs) 2022-11-23T02:58:19.9330237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9330364Z return func(*args, **kwargs) 2022-11-23T02:58:19.9330729Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9330852Z self.run_subtests( 2022-11-23T02:58:19.9331216Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9331379Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9331749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9331903Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9332286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9332456Z output = model(*input) 2022-11-23T02:58:19.9332773Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9332921Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9333307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9333484Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9333856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9333981Z _lazy_init(state, module) 2022-11-23T02:58:19.9334339Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9334486Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9334814Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9334941Z return func(*args, **kwargs) 2022-11-23T02:58:19.9335331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9335434Z p_assert( 2022-11-23T02:58:19.9335775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9335904Z traceback.print_stack() 2022-11-23T02:58:19.9336154Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.9336404Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.9336792Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.9336925Z File "", line 1, in 2022-11-23T02:58:19.9337145Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9337288Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9337494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9337652Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9337869Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9337973Z self.run() 2022-11-23T02:58:19.9338161Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9338309Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9338655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9338789Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9339157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:58:19.9339283Z getattr(self, test_name)() 2022-11-23T02:58:19.9339651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9339750Z fn() 2022-11-23T02:58:19.9340147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9340279Z test(self, **param_kwargs) 2022-11-23T02:58:19.9340649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9340775Z return func(*args, **kwargs) 2022-11-23T02:58:19.9341058Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9341173Z self.run_subtests( 2022-11-23T02:58:19.9341532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9341743Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9342095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9342253Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9342636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9342761Z output = model(*input) 2022-11-23T02:58:19.9343092Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9343234Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9343617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9343795Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9344156Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9344279Z _lazy_init(state, module) 2022-11-23T02:58:19.9344641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9344788Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9345133Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9345260Z return func(*args, **kwargs) 2022-11-23T02:58:19.9345644Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9345747Z p_assert( 2022-11-23T02:58:19.9346072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9346201Z traceback.print_stack() 2022-11-23T02:58:19.9346611Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 
2022-11-23T02:58:19.9346741Z File "", line 1, in 2022-11-23T02:58:19.9346956Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9347103Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9347307Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9347459Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9347657Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9347761Z self.run() 2022-11-23T02:58:19.9347964Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9348113Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9348459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9348598Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9349183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9349300Z getattr(self, test_name)() 2022-11-23T02:58:19.9349741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9349848Z fn() 2022-11-23T02:58:19.9350252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9350378Z test(self, **param_kwargs) 2022-11-23T02:58:19.9350740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9350866Z return func(*args, **kwargs) 2022-11-23T02:58:19.9351152Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9351309Z self.run_subtests( 2022-11-23T02:58:19.9351675Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9351841Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9352210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9352364Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9352748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9352870Z output = model(*input) 2022-11-23T02:58:19.9353200Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9353323Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9353705Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9353889Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9354262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9354385Z _lazy_init(state, module) 2022-11-23T02:58:19.9354742Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9354886Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9355229Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9355336Z return func(*args, **kwargs) 2022-11-23T02:58:19.9355717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9355817Z p_assert( 2022-11-23T02:58:19.9356163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9356290Z traceback.print_stack() 2022-11-23T02:58:19.9356538Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.9356789Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.9357196Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.9357598Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.9357711Z File "", line 1, in 2022-11-23T02:58:19.9357927Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9358070Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9358280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9358434Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9358652Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9358755Z self.run() 2022-11-23T02:58:19.9358992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9359143Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9359491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9359626Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9359993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9360118Z getattr(self, test_name)() 2022-11-23T02:58:19.9360481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9360625Z fn() 2022-11-23T02:58:19.9360980Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9361108Z test(self, **param_kwargs) 2022-11-23T02:58:19.9361475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9361602Z return func(*args, **kwargs) 2022-11-23T02:58:19.9361886Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9362001Z self.run_subtests( 2022-11-23T02:58:19.9362360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9362523Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9362877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9363037Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9363418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9363541Z output = model(*input) 2022-11-23T02:58:19.9363878Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9364018Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9364400Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9364580Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9364932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9365054Z _lazy_init(state, module) 2022-11-23T02:58:19.9365415Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:58:19.9365560Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9365906Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9366034Z return func(*args, **kwargs) 2022-11-23T02:58:19.9366421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9366523Z p_assert( 2022-11-23T02:58:19.9366847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9366974Z traceback.print_stack() 2022-11-23T02:58:19.9367106Z File "", line 1, in 2022-11-23T02:58:19.9367318Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9367460Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9367669Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9367822Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9368040Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9368175Z self.run() 2022-11-23T02:58:19.9368389Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9368539Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9368888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9369025Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9369391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9369517Z getattr(self, test_name)() 2022-11-23T02:58:19.9369864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9370012Z fn() 2022-11-23T02:58:19.9370385Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9370509Z test(self, **param_kwargs) 2022-11-23T02:58:19.9370874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9371001Z return func(*args, **kwargs) 2022-11-23T02:58:19.9371286Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9371400Z self.run_subtests( 2022-11-23T02:58:19.9371742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9371909Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9372280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9372439Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9372823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9372946Z output = model(*input) 2022-11-23T02:58:19.9373278Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9373420Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9373780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9373962Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9374336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9374461Z _lazy_init(state, module) 2022-11-23T02:58:19.9374816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 
62, in _lazy_init 2022-11-23T02:58:19.9374961Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9375308Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9375434Z return func(*args, **kwargs) 2022-11-23T02:58:19.9375822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9375907Z p_assert( 2022-11-23T02:58:19.9376249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9376377Z traceback.print_stack() 2022-11-23T02:58:19.9376627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.9376875Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.9377287Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 
2022-11-23T02:58:19.9377417Z File "", line 1, in 2022-11-23T02:58:19.9377684Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9377816Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9378026Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9378180Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9378400Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9378505Z self.run() 2022-11-23T02:58:19.9378711Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9378862Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9379192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9379382Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9379755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9379884Z getattr(self, test_name)() 2022-11-23T02:58:19.9380249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9380347Z fn() 2022-11-23T02:58:19.9380720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9380844Z test(self, **param_kwargs) 2022-11-23T02:58:19.9381186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9381316Z return func(*args, **kwargs) 2022-11-23T02:58:19.9381599Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9381717Z self.run_subtests( 2022-11-23T02:58:19.9382076Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9382244Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9382614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9382767Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9383131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9383253Z output = model(*input) 2022-11-23T02:58:19.9383583Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9383724Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9384113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9384291Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9384664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9384786Z _lazy_init(state, module) 2022-11-23T02:58:19.9385127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9385273Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9385615Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9385741Z return func(*args, **kwargs) 2022-11-23T02:58:19.9386126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9386232Z p_assert( 2022-11-23T02:58:19.9386573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9386702Z traceback.print_stack() 2022-11-23T02:58:19.9387136Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.9387274Z File "", line 1, in 2022-11-23T02:58:19.9387487Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9387630Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9387841Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9387995Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9388211Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9388316Z self.run() 2022-11-23T02:58:19.9388563Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9388713Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9389282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9389428Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9389804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9389930Z getattr(self, test_name)() 2022-11-23T02:58:19.9390297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9390376Z fn() 2022-11-23T02:58:19.9390747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9390873Z test(self, **param_kwargs) 2022-11-23T02:58:19.9391234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9391365Z return func(*args, **kwargs) 2022-11-23T02:58:19.9391648Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9391766Z self.run_subtests( 2022-11-23T02:58:19.9392126Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9392272Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9392643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9392799Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9393179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9393299Z output = model(*input) 2022-11-23T02:58:19.9393635Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9393778Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9394162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9394339Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9394696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9394819Z _lazy_init(state, module) 2022-11-23T02:58:19.9395176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9395322Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9395665Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9395794Z return func(*args, **kwargs) 2022-11-23T02:58:19.9396179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9396282Z p_assert( 2022-11-23T02:58:19.9396678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9396821Z traceback.print_stack() 2022-11-23T02:58:19.9397070Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.9397318Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.9397726Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.9397859Z File "", line 1, in 2022-11-23T02:58:19.9398077Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9398282Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9398471Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9398626Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9398847Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9398954Z self.run() 2022-11-23T02:58:19.9399160Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9399308Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9399658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9399775Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9400144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9400268Z getattr(self, test_name)() 2022-11-23T02:58:19.9400638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9400737Z fn() 2022-11-23T02:58:19.9401109Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9401237Z test(self, **param_kwargs) 2022-11-23T02:58:19.9401602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9401710Z return func(*args, **kwargs) 2022-11-23T02:58:19.9401991Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9402107Z self.run_subtests( 2022-11-23T02:58:19.9402464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9402626Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9402997Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9403151Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9403535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9403637Z output = model(*input) 2022-11-23T02:58:19.9403967Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9404112Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9404496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9404675Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9405045Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9405175Z _lazy_init(state, module) 2022-11-23T02:58:19.9405533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9405659Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9406048Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9406182Z return func(*args, **kwargs) 2022-11-23T02:58:19.9406569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9406672Z p_assert( 2022-11-23T02:58:19.9407013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9407142Z traceback.print_stack() 2022-11-23T02:58:19.9407548Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.9407707Z File "", line 1, in 2022-11-23T02:58:19.9407927Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9408070Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9408281Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9408432Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9408648Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9408753Z self.run() 2022-11-23T02:58:19.9408957Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9409087Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9409436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9409570Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9409944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9410068Z getattr(self, test_name)() 2022-11-23T02:58:19.9410434Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9410533Z fn() 2022-11-23T02:58:19.9410904Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9411010Z test(self, **param_kwargs) 2022-11-23T02:58:19.9411369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9411495Z return func(*args, **kwargs) 2022-11-23T02:58:19.9411781Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9411893Z self.run_subtests( 2022-11-23T02:58:19.9412258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9412422Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9412795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9412932Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9413315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9413434Z output = model(*input) 2022-11-23T02:58:19.9413767Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9413911Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9414292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9414475Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9414848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9414953Z _lazy_init(state, module) 2022-11-23T02:58:19.9415356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9415509Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9415853Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9415979Z return func(*args, **kwargs) 2022-11-23T02:58:19.9416361Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9416465Z p_assert( 2022-11-23T02:58:19.9416804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9416958Z traceback.print_stack() 2022-11-23T02:58:19.9417207Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.9417457Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.9417866Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 
2022-11-23T02:58:19.9417998Z File "", line 1, in 2022-11-23T02:58:19.9418215Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9418360Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9418568Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9418705Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9418923Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9419033Z self.run() 2022-11-23T02:58:19.9419240Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9419387Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9419737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9419874Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9420222Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9420347Z getattr(self, test_name)() 2022-11-23T02:58:19.9420713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9420812Z fn() 2022-11-23T02:58:19.9421181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9421306Z test(self, **param_kwargs) 2022-11-23T02:58:19.9421673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9421800Z return func(*args, **kwargs) 2022-11-23T02:58:19.9422069Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9422185Z self.run_subtests( 2022-11-23T02:58:19.9422545Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9422709Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9423080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9423242Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9423622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9423746Z output = model(*input) 2022-11-23T02:58:19.9424056Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9424200Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9424629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9424814Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9425188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9425310Z _lazy_init(state, module) 2022-11-23T02:58:19.9425669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9425815Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9426157Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9426312Z return func(*args, **kwargs) 2022-11-23T02:58:19.9426696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9426800Z p_assert( 2022-11-23T02:58:19.9427144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9427273Z traceback.print_stack() 2022-11-23T02:58:19.9427680Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.9427813Z File "", line 1, in 2022-11-23T02:58:19.9428010Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9428154Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9428360Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9428517Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9428734Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9428838Z self.run() 2022-11-23T02:58:19.9429263Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9429421Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9429754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9429890Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9430258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9430384Z getattr(self, test_name)() 2022-11-23T02:58:19.9430750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9430849Z fn() 2022-11-23T02:58:19.9431221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9431345Z test(self, **param_kwargs) 2022-11-23T02:58:19.9431690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9431819Z return func(*args, **kwargs) 2022-11-23T02:58:19.9432102Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9432218Z self.run_subtests( 2022-11-23T02:58:19.9432578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9432741Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9433113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9433272Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9433637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9433760Z output = model(*input) 2022-11-23T02:58:19.9434162Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9434318Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9434706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9434884Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9435258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9435380Z _lazy_init(state, module) 2022-11-23T02:58:19.9435717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9435930Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9436275Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9436403Z return func(*args, **kwargs) 2022-11-23T02:58:19.9436794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9436899Z p_assert( 2022-11-23T02:58:19.9437243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9437371Z traceback.print_stack() 2022-11-23T02:58:19.9437602Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.9437851Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.9438260Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.9438397Z File "", line 1, in 2022-11-23T02:58:19.9438612Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9438760Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9438968Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9439121Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9439319Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9439424Z self.run() 2022-11-23T02:58:19.9439631Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9439780Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9440128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9440266Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9440636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9440761Z getattr(self, test_name)() 2022-11-23T02:58:19.9441113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9441213Z fn() 2022-11-23T02:58:19.9441585Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9441712Z test(self, **param_kwargs) 2022-11-23T02:58:19.9442075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9442202Z return func(*args, **kwargs) 2022-11-23T02:58:19.9442487Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9442607Z self.run_subtests( 2022-11-23T02:58:19.9442946Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9443113Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9443532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9443693Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9444077Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9444199Z output = model(*input) 2022-11-23T02:58:19.9444529Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9444671Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9445033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9445274Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9445647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9445774Z _lazy_init(state, module) 2022-11-23T02:58:19.9446135Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9446280Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9446624Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9446751Z return func(*args, **kwargs) 2022-11-23T02:58:19.9447113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9447219Z p_assert( 2022-11-23T02:58:19.9447560Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9447693Z traceback.print_stack() 2022-11-23T02:58:19.9448096Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.9448229Z File "", line 1, in 2022-11-23T02:58:19.9448445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9448590Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9448778Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9448930Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9449145Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9449252Z self.run() 2022-11-23T02:58:19.9449457Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9449603Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9449954Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9450069Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9450492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9450618Z getattr(self, test_name)() 2022-11-23T02:58:19.9450985Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9451084Z fn() 2022-11-23T02:58:19.9451453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9451577Z test(self, **param_kwargs) 2022-11-23T02:58:19.9451938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9452045Z return func(*args, **kwargs) 2022-11-23T02:58:19.9452331Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9452446Z self.run_subtests( 2022-11-23T02:58:19.9452852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9453021Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9453393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9453547Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9453930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9454033Z output = model(*input) 2022-11-23T02:58:19.9454366Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9454559Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9454942Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9455124Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9455497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9455619Z _lazy_init(state, module) 2022-11-23T02:58:19.9455974Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9456101Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9465301Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9465481Z return func(*args, **kwargs) 2022-11-23T02:58:19.9465939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9466053Z p_assert( 2022-11-23T02:58:19.9466407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9466534Z traceback.print_stack() 2022-11-23T02:58:19.9466790Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.9467039Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.9467450Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 
2022-11-23T02:58:19.9467564Z File "", line 1, in 2022-11-23T02:58:19.9467779Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9467923Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9468131Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9468287Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9468504Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9468607Z self.run() 2022-11-23T02:58:19.9468797Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9469265Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9469644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9469779Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9470150Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9470272Z getattr(self, test_name)() 2022-11-23T02:58:19.9470637Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9470740Z fn() 2022-11-23T02:58:19.9471099Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9471220Z test(self, **param_kwargs) 2022-11-23T02:58:19.9471713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9471854Z return func(*args, **kwargs) 2022-11-23T02:58:19.9472142Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9472257Z self.run_subtests( 2022-11-23T02:58:19.9472618Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9472781Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9473132Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9473356Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9473742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9473866Z output = model(*input) 2022-11-23T02:58:19.9474196Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9474337Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9474717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9474894Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9475252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9475373Z _lazy_init(state, module) 2022-11-23T02:58:19.9475735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9475880Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9476223Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9476354Z return func(*args, **kwargs) 2022-11-23T02:58:19.9476736Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9476838Z p_assert( 2022-11-23T02:58:19.9477164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9477290Z traceback.print_stack() 2022-11-23T02:58:19.9477693Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.9477824Z File "", line 1, in 2022-11-23T02:58:19.9478042Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9478185Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9478391Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9478545Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9478745Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9478848Z self.run() 2022-11-23T02:58:19.9479052Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9479198Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9479545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9479677Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9480042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9480167Z getattr(self, test_name)() 2022-11-23T02:58:19.9480516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9480613Z fn() 2022-11-23T02:58:19.9481037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9481169Z test(self, **param_kwargs) 2022-11-23T02:58:19.9481530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9481655Z return func(*args, **kwargs) 2022-11-23T02:58:19.9481939Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9482051Z self.run_subtests( 2022-11-23T02:58:19.9482393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9482642Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9483014Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9483168Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9483551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9483671Z output = model(*input) 2022-11-23T02:58:19.9484062Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9484208Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9484573Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9484752Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9485123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9485248Z _lazy_init(state, module) 2022-11-23T02:58:19.9485603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9485749Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9486091Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9486215Z return func(*args, **kwargs) 2022-11-23T02:58:19.9486578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9486680Z p_assert( 2022-11-23T02:58:19.9487022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9487149Z traceback.print_stack() 2022-11-23T02:58:19.9487399Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.9487645Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.9488054Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.9488187Z File "", line 1, in 2022-11-23T02:58:19.9488383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9488525Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9488730Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9488882Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9489096Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9489200Z self.run() 2022-11-23T02:58:19.9489404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9489554Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9489885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9490017Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9490431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9490560Z getattr(self, test_name)() 2022-11-23T02:58:19.9490925Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9491022Z fn() 
2022-11-23T02:58:19.9491390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9491496Z test(self, **param_kwargs) 2022-11-23T02:58:19.9491856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9492027Z return func(*args, **kwargs) 2022-11-23T02:58:19.9492309Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9492423Z self.run_subtests( 2022-11-23T02:58:19.9492785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9492947Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9493314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9493451Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9493830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9493950Z output = model(*input) 2022-11-23T02:58:19.9494280Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9494428Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9494805Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9494985Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9495359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9495481Z _lazy_init(state, module) 2022-11-23T02:58:19.9495819Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9495962Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9496303Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9496428Z return func(*args, **kwargs) 2022-11-23T02:58:19.9496817Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9496920Z p_assert( 2022-11-23T02:58:19.9497263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9497389Z traceback.print_stack() 2022-11-23T02:58:19.9497779Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.9497910Z File "", line 1, in 2022-11-23T02:58:19.9498126Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9498268Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9498473Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9498626Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9498844Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9498934Z self.run() 2022-11-23T02:58:19.9499138Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9499284Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9499673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9499813Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9500181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:58:19.9500305Z getattr(self, test_name)() 2022-11-23T02:58:19.9500667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9500748Z fn() 2022-11-23T02:58:19.9501118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9501287Z test(self, **param_kwargs) 2022-11-23T02:58:19.9501650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9501776Z return func(*args, **kwargs) 2022-11-23T02:58:19.9502064Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9502177Z self.run_subtests( 2022-11-23T02:58:19.9502534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9502680Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9503049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9503201Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9503583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9503706Z output = model(*input) 2022-11-23T02:58:19.9504037Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9504182Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9504563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9504723Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9505096Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9505217Z _lazy_init(state, module) 2022-11-23T02:58:19.9505570Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9505713Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9506055Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9506269Z return func(*args, **kwargs) 2022-11-23T02:58:19.9506988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9507144Z p_assert( 2022-11-23T02:58:19.9507791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9508012Z traceback.print_stack() 2022-11-23T02:58:19.9508445Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.9508773Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.9509705Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 
2022-11-23T02:58:19.9509853Z File "", line 1, in 2022-11-23T02:58:19.9510072Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9510199Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9510509Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9510678Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9510894Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9510999Z self.run() 2022-11-23T02:58:19.9511206Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9511355Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9511712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9511828Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9512194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9512385Z getattr(self, test_name)() 2022-11-23T02:58:19.9512758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9512858Z fn() 2022-11-23T02:58:19.9513231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9513353Z test(self, **param_kwargs) 2022-11-23T02:58:19.9513717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9513825Z return func(*args, **kwargs) 2022-11-23T02:58:19.9514105Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9514219Z self.run_subtests( 2022-11-23T02:58:19.9514578Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9514745Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9515116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9515272Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9515653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9515756Z output = model(*input) 2022-11-23T02:58:19.9516087Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9516231Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9516613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9516789Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9517166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9517289Z _lazy_init(state, module) 2022-11-23T02:58:19.9517647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9517775Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9518118Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9518242Z return func(*args, **kwargs) 2022-11-23T02:58:19.9518626Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9518729Z p_assert( 2022-11-23T02:58:19.9519069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9519197Z traceback.print_stack() 2022-11-23T02:58:19.9519610Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.9519724Z File "", line 1, in 2022-11-23T02:58:19.9519985Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9520134Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9520339Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9520492Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9520708Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9520812Z self.run() 2022-11-23T02:58:19.9521001Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9521149Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9521496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9521689Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9522060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9522188Z getattr(self, test_name)() 2022-11-23T02:58:19.9522553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9522651Z fn() 2022-11-23T02:58:19.9523003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9523126Z test(self, **param_kwargs) 2022-11-23T02:58:19.9523484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9523609Z return func(*args, **kwargs) 2022-11-23T02:58:19.9523894Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9524011Z self.run_subtests( 2022-11-23T02:58:19.9524368Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9524533Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9524885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9525038Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9525417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9525535Z output = model(*input) 2022-11-23T02:58:19.9525866Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9526011Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9526395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9526571Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9526931Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9527052Z _lazy_init(state, module) 2022-11-23T02:58:19.9527406Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9527548Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9527888Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9528013Z return func(*args, **kwargs) 2022-11-23T02:58:19.9528398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9528504Z p_assert( 2022-11-23T02:58:19.9528827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9528952Z traceback.print_stack() 2022-11-23T02:58:19.9529253Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.9529502Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.9529912Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.9530042Z File "", line 1, in 2022-11-23T02:58:19.9530256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9530400Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9530588Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9530739Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9531011Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9531115Z self.run() 2022-11-23T02:58:19.9531322Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9531473Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9531819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9531951Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9532301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9532424Z getattr(self, test_name)() 2022-11-23T02:58:19.9532787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9532884Z fn() 
2022-11-23T02:58:19.9533256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9533382Z test(self, **param_kwargs) 2022-11-23T02:58:19.9533745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9533874Z return func(*args, **kwargs) 2022-11-23T02:58:19.9534139Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9534252Z self.run_subtests( 2022-11-23T02:58:19.9534607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9534772Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9535140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9535294Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9535682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9535802Z output = model(*input) 2022-11-23T02:58:19.9536116Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9536256Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9536639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9536817Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9537187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9537308Z _lazy_init(state, module) 2022-11-23T02:58:19.9537662Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9537809Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9538135Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9538261Z return func(*args, **kwargs) 2022-11-23T02:58:19.9538690Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9538799Z p_assert( 2022-11-23T02:58:19.9539145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9539272Z traceback.print_stack() 2022-11-23T02:58:19.9539674Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.9539803Z File "", line 1, in 2022-11-23T02:58:19.9539999Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9540189Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9540395Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9540547Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9540766Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9540864Z self.run() 2022-11-23T02:58:19.9541072Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9541202Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9541542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9541672Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9542036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:58:19.9542161Z getattr(self, test_name)() 2022-11-23T02:58:19.9542524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9542625Z fn() 2022-11-23T02:58:19.9542984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9543092Z test(self, **param_kwargs) 2022-11-23T02:58:19.9543452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9543571Z return func(*args, **kwargs) 2022-11-23T02:58:19.9543844Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9543952Z self.run_subtests( 2022-11-23T02:58:19.9544310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9544474Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9544842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9544977Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9545352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9545463Z output = model(*input) 2022-11-23T02:58:19.9545785Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9545921Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9546301Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9546478Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9546852Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9546975Z _lazy_init(state, module) 2022-11-23T02:58:19.9547315Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9547459Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9547847Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9547977Z return func(*args, **kwargs) 2022-11-23T02:58:19.9548365Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9548467Z p_assert( 2022-11-23T02:58:19.9548807Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9548916Z traceback.print_stack() 2022-11-23T02:58:19.9549822Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.9550072Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.9550628Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 
2022-11-23T02:58:19.9550759Z File "", line 1, in 2022-11-23T02:58:19.9550976Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9551120Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9551325Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9551459Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9551674Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9551777Z self.run() 2022-11-23T02:58:19.9551982Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9552130Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9552483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9552618Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9552987Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9553093Z getattr(self, test_name)() 2022-11-23T02:58:19.9553453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9553551Z fn() 2022-11-23T02:58:19.9553920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9554041Z test(self, **param_kwargs) 2022-11-23T02:58:19.9554400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9554527Z return func(*args, **kwargs) 2022-11-23T02:58:19.9554814Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9554909Z self.run_subtests( 2022-11-23T02:58:19.9555270Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9555430Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9555801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9555954Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9556335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9556455Z output = model(*input) 2022-11-23T02:58:19.9556782Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9556908Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9557287Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9557463Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9557898Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9558029Z _lazy_init(state, module) 2022-11-23T02:58:19.9558389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9558533Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9558875Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9558982Z return func(*args, **kwargs) 2022-11-23T02:58:19.9559367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9559525Z p_assert( 2022-11-23T02:58:19.9559870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9559997Z traceback.print_stack() 2022-11-23T02:58:19.9560405Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.9560537Z File "", line 1, in 2022-11-23T02:58:19.9560751Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9560875Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9561083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9561237Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9561451Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9561553Z self.run() 2022-11-23T02:58:19.9561760Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9561908Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9562252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9562370Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9562737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9562859Z getattr(self, test_name)() 2022-11-23T02:58:19.9563223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9563320Z fn() 2022-11-23T02:58:19.9563688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9563812Z test(self, **param_kwargs) 2022-11-23T02:58:19.9564162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9564287Z return func(*args, **kwargs) 2022-11-23T02:58:19.9564573Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9564686Z self.run_subtests( 2022-11-23T02:58:19.9565044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9565205Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9565572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9565726Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9566108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9566215Z output = model(*input) 2022-11-23T02:58:19.9566547Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9566689Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9567119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9567304Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9567677Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9567797Z _lazy_init(state, module) 2022-11-23T02:58:19.9568150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9568276Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9568620Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9568793Z return func(*args, **kwargs) 2022-11-23T02:58:19.9569178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9569280Z p_assert( 2022-11-23T02:58:19.9569625Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9569751Z traceback.print_stack() 2022-11-23T02:58:19.9570000Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.9570224Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.9570631Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.9571034Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.9571283Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.9571523Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.9571926Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.9572329Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.9572574Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.9572812Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.9573192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.9573590Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 
2022-11-23T02:58:19.9573839Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.9574084Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.9574482Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9574876Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9575639Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9575888Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.9576124Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.9576568Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9576956Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9577713Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.9577959Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.9578248Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.9578646Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.9579046Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.9579292Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.9579690Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.9579931Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.9580329Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.9580556Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.9580795Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.9581193Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.9581590Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 
2022-11-23T02:58:19.9581834Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.9582071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.9582469Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.9582867Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.9583107Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.9583347Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.9583728Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.9584119Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.9584362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.9584599Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.9584996Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.9585391Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.9586215Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9586469Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.9586708Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.9587106Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.9587483Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.9587798Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:58:19.9588041Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:58:19.9588443Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.9588838Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.9589703Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:58:19.9589964Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:58:19.9590369Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.9590772Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 
2022-11-23T02:58:19.9591001Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:58:19.9591237Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:58:19.9591634Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.9592028Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.9592268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:58:19.9592506Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:58:19.9592904Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.9593303Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.9593542Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:58:19.9593778Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:58:19.9594156Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.9594549Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.9595305Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9595637Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:58:19.9595888Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:58:19.9596288Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.9596683Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.9596920Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:58:19.9597310Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.9597621Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:58:19.9598002Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.9598756Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9598999Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:58:19.9599236Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:58:19.9599630Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 
2022-11-23T02:58:19.9600026Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.9600276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:58:19.9600518Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:58:19.9600917Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.9601307Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.9601532Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:58:19.9601769Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:58:19.9602171Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.9602567Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.9602809Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:58:19.9603205Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.9603448Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:58:19.9603843Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.9604604Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9604897Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:58:19.9605138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:58:19.9605518Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.9605912Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.9606026Z dist init r=0, world=2 2022-11-23T02:58:19.9606360Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9606737Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9607059Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9607374Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9607684Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9607993Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.9608302Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9608595Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9608905Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9609212Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9609519Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9609826Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9609942Z dist init r=1, world=2 2022-11-23T02:58:19.9610276Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9610600Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9610914Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9611223Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.9611529Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9611838Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9612171Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9612483Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9612788Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9613093Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9613442Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9613751Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9613853Z ok (31.254s) 2022-11-23T02:58:19.9614226Z test_mixture_of_experts_with_delay_before_free_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90416 2022-11-23T02:58:19.9614453Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90417 2022-11-23T02:58:19.9614838Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9614997Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9615380Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9615577Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9615955Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9616131Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9616516Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9616708Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9616955Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.9617201Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.9617586Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9617991Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.9618227Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.9618459Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.9619492Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9619606Z warnings.warn( 2022-11-23T02:58:19.9619853Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 0 2022-11-23T02:58:19.9620919Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9621035Z warnings.warn( 2022-11-23T02:58:19.9621276Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:2 to store for rank: 1 2022-11-23T02:58:19.9621679Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 2022-11-23T02:58:19.9622059Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 2 nodes. 
2022-11-23T02:58:19.9622238Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9622455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9622600Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9622810Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9622962Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9623180Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9623285Z self.run() 2022-11-23T02:58:19.9623475Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9623622Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9623973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9624107Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9624478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9624602Z getattr(self, test_name)() 2022-11-23T02:58:19.9624969Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9625067Z fn() 2022-11-23T02:58:19.9625421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9625544Z test(self, **param_kwargs) 2022-11-23T02:58:19.9625905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9626030Z return func(*args, **kwargs) 2022-11-23T02:58:19.9626312Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9626430Z self.run_subtests( 2022-11-23T02:58:19.9626789Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9626953Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9627309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9627463Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9627841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9627960Z output = model(*input) 2022-11-23T02:58:19.9628291Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9628434Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9628812Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9629572Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9630016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9630238Z _lazy_init(state, module) 2022-11-23T02:58:19.9630609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9630753Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9631096Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9631223Z return func(*args, **kwargs) 2022-11-23T02:58:19.9631608Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9631710Z p_assert( 2022-11-23T02:58:19.9632032Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9632222Z traceback.print_stack() 2022-11-23T02:58:19.9632353Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9632569Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9632718Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9632921Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9633074Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9633273Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9633377Z self.run() 2022-11-23T02:58:19.9633581Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9633727Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9634073Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9634210Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9634577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9634701Z getattr(self, test_name)() 2022-11-23T02:58:19.9635054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9635151Z fn() 2022-11-23T02:58:19.9635520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9635642Z test(self, **param_kwargs) 2022-11-23T02:58:19.9636002Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9636127Z return func(*args, **kwargs) 2022-11-23T02:58:19.9636411Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9636528Z self.run_subtests(
2022-11-23T02:58:19.9636872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9637034Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9637404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9637558Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9637937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9638056Z output = model(*input) 2022-11-23T02:58:19.9638389Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9638531Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9638893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9639072Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9639446Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9639614Z _lazy_init(state, module) 2022-11-23T02:58:19.9639978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9640123Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9640464Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9640589Z return func(*args, **kwargs) 2022-11-23T02:58:19.9640957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9641060Z p_assert( 2022-11-23T02:58:19.9641397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:58:19.9641571Z traceback.print_stack() 2022-11-23T02:58:19.9641821Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 0 2022-11-23T02:58:19.9642071Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:3 to store for rank: 1 2022-11-23T02:58:19.9642480Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.9642610Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9642809Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9642953Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9643161Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9643313Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9643532Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9643636Z self.run() 2022-11-23T02:58:19.9643840Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9643987Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9644320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9644454Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9644822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9644945Z getattr(self, test_name)() 2022-11-23T02:58:19.9645307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9645405Z fn() 2022-11-23T02:58:19.9645777Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9645902Z test(self, **param_kwargs)
2022-11-23T02:58:19.9646249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9646374Z return func(*args, **kwargs) 2022-11-23T02:58:19.9646659Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9646773Z self.run_subtests( 2022-11-23T02:58:19.9647130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9647295Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9647662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9647814Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9648179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9648304Z output = model(*input) 2022-11-23T02:58:19.9648637Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9648830Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9649224Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9649401Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9649773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9649894Z _lazy_init(state, module) 2022-11-23T02:58:19.9650276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9650425Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9650842Z File
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9650968Z return func(*args, **kwargs) 2022-11-23T02:58:19.9651357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9651459Z p_assert( 2022-11-23T02:58:19.9651798Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9651925Z traceback.print_stack() 2022-11-23T02:58:19.9652314Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:3 with 2 nodes. 2022-11-23T02:58:19.9652443Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9652655Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9652799Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9653007Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9653157Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9653371Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9653457Z self.run() 2022-11-23T02:58:19.9653667Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9653814Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9654160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9654292Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9654657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9654780Z getattr(self, test_name)() 2022-11-23T02:58:19.9655142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9655227Z fn()
2022-11-23T02:58:19.9655596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9655719Z test(self, **param_kwargs) 2022-11-23T02:58:19.9656083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9656207Z return func(*args, **kwargs) 2022-11-23T02:58:19.9656487Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9656600Z self.run_subtests( 2022-11-23T02:58:19.9656956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9657102Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9657470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9657627Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9658009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9658188Z output = model(*input) 2022-11-23T02:58:19.9658528Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9658669Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9659049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9659211Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9659582Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9659702Z _lazy_init(state, module) 2022-11-23T02:58:19.9660110Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9660254Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9660599Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9660725Z return func(*args, **kwargs) 2022-11-23T02:58:19.9661107Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9661192Z p_assert( 2022-11-23T02:58:19.9661534Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9661660Z traceback.print_stack() 2022-11-23T02:58:19.9661906Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 0 2022-11-23T02:58:19.9662153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:4 to store for rank: 1 2022-11-23T02:58:19.9662559Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 
2022-11-23T02:58:19.9662689Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9662905Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9663031Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9663235Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9663385Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9663602Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9663705Z self.run() 2022-11-23T02:58:19.9663910Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9664056Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9664404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9664524Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9664892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9665019Z getattr(self, test_name)() 2022-11-23T02:58:19.9665383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9665481Z fn() 2022-11-23T02:58:19.9665848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9665971Z test(self, **param_kwargs) 2022-11-23T02:58:19.9666331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9666439Z return func(*args, **kwargs) 2022-11-23T02:58:19.9666721Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9666838Z self.run_subtests( 2022-11-23T02:58:19.9667198Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9667408Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9667788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9667942Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9668321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9668423Z output = model(*input) 2022-11-23T02:58:19.9668754Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9668896Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9669972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9670151Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9670530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9670651Z _lazy_init(state, module) 2022-11-23T02:58:19.9671009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9671134Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9671479Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9671604Z return func(*args, **kwargs) 2022-11-23T02:58:19.9671989Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9672095Z p_assert( 2022-11-23T02:58:19.9672438Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9672566Z traceback.print_stack() 2022-11-23T02:58:19.9672977Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:4 with 2 nodes. 2022-11-23T02:58:19.9673090Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9673306Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9673449Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9673655Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9673807Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9674024Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9674127Z self.run() 2022-11-23T02:58:19.9674314Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9674466Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9674812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9674945Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9675315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9675439Z getattr(self, test_name)() 2022-11-23T02:58:19.9675803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9675902Z fn() 2022-11-23T02:58:19.9676255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9676378Z test(self, **param_kwargs) 2022-11-23T02:58:19.9676740Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9676869Z return func(*args, **kwargs) 2022-11-23T02:58:19.9677150Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9677265Z self.run_subtests( 2022-11-23T02:58:19.9677705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9677884Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9678237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9678391Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9678771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9678891Z output = model(*input) 2022-11-23T02:58:19.9679286Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9679429Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9679811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9679992Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9680364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9680469Z _lazy_init(state, module) 2022-11-23T02:58:19.9680825Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9680968Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9681311Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9681436Z return func(*args, **kwargs) 2022-11-23T02:58:19.9681824Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9681927Z p_assert( 2022-11-23T02:58:19.9682254Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9682382Z traceback.print_stack() 2022-11-23T02:58:19.9682631Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 0 2022-11-23T02:58:19.9682881Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:5 to store for rank: 1 2022-11-23T02:58:19.9683285Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.9683417Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9683633Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9683780Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9683968Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9684120Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9684340Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9684444Z self.run() 2022-11-23T02:58:19.9684648Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9684795Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9685143Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9685277Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9685627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9685750Z getattr(self, test_name)() 2022-11-23T02:58:19.9686115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9686217Z fn() 2022-11-23T02:58:19.9686587Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9686761Z test(self, **param_kwargs) 2022-11-23T02:58:19.9687134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9687259Z return func(*args, **kwargs) 2022-11-23T02:58:19.9687527Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9687640Z self.run_subtests( 2022-11-23T02:58:19.9687998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9688161Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9688583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9688738Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9689125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9689246Z output = model(*input) 2022-11-23T02:58:19.9689561Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9689703Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9690082Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9690257Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9690628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9690753Z _lazy_init(state, module) 2022-11-23T02:58:19.9691114Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9691258Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9691586Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9691712Z return func(*args, **kwargs) 2022-11-23T02:58:19.9692096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9692198Z p_assert( 2022-11-23T02:58:19.9692539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9692666Z traceback.print_stack() 2022-11-23T02:58:19.9693072Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:5 with 2 nodes. 2022-11-23T02:58:19.9693206Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9693402Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9693544Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9693753Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9693908Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9694123Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9694227Z self.run() 2022-11-23T02:58:19.9694432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9694579Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9694910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9695044Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9695410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9695537Z getattr(self, test_name)() 2022-11-23T02:58:19.9695900Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9696046Z fn() 2022-11-23T02:58:19.9696425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9696530Z test(self, **param_kwargs) 2022-11-23T02:58:19.9696891Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9697016Z return func(*args, **kwargs) 2022-11-23T02:58:19.9697298Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9697412Z self.run_subtests( 2022-11-23T02:58:19.9697771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9697982Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9698355Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9698512Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9698877Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9698995Z output = model(*input) 2022-11-23T02:58:19.9699323Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9699464Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9699843Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9700022Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9700399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9700520Z _lazy_init(state, module) 2022-11-23T02:58:19.9700861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9701005Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9701349Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9701476Z return func(*args, **kwargs) 2022-11-23T02:58:19.9701863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9701966Z p_assert( 2022-11-23T02:58:19.9702306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9702435Z traceback.print_stack() 2022-11-23T02:58:19.9702667Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 0 2022-11-23T02:58:19.9702913Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:6 to store for rank: 1 2022-11-23T02:58:19.9703320Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 
2022-11-23T02:58:19.9703451Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9703663Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9703806Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9704010Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9704164Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9704362Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9704465Z self.run() 2022-11-23T02:58:19.9704674Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9704821Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9705166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9705346Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9705722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9705828Z getattr(self, test_name)() 2022-11-23T02:58:19.9706193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9706289Z fn() 2022-11-23T02:58:19.9706658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9706780Z test(self, **param_kwargs) 2022-11-23T02:58:19.9707192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9707317Z return func(*args, **kwargs) 2022-11-23T02:58:19.9707602Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9707697Z self.run_subtests( 2022-11-23T02:58:19.9708055Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9708218Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9708585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9708737Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9709856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9709993Z output = model(*input) 2022-11-23T02:58:19.9710336Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9710460Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9710847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9711025Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9711398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9711518Z _lazy_init(state, module) 2022-11-23T02:58:19.9711877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9712019Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9712364Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9712476Z return func(*args, **kwargs) 2022-11-23T02:58:19.9712861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9712963Z p_assert( 2022-11-23T02:58:19.9713306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9713432Z traceback.print_stack() 2022-11-23T02:58:19.9713835Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:6 with 2 nodes. 2022-11-23T02:58:19.9713966Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9714183Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9714309Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9714516Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9714672Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9714890Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9714994Z self.run() 2022-11-23T02:58:19.9715199Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9715426Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9715788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9715905Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9716271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9716393Z getattr(self, test_name)() 2022-11-23T02:58:19.9716757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9716854Z fn() 2022-11-23T02:58:19.9717221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9717420Z test(self, **param_kwargs) 2022-11-23T02:58:19.9717781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9717894Z return func(*args, **kwargs) 2022-11-23T02:58:19.9718176Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9718290Z self.run_subtests( 2022-11-23T02:58:19.9718648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9718809Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9719180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9719332Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9719716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9719819Z output = model(*input) 2022-11-23T02:58:19.9720155Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9720298Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9720680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9720856Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9721227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9721348Z _lazy_init(state, module) 2022-11-23T02:58:19.9721704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9721834Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9722179Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9722304Z return func(*args, **kwargs) 2022-11-23T02:58:19.9722692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9722795Z p_assert( 2022-11-23T02:58:19.9723136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9723263Z traceback.print_stack() 2022-11-23T02:58:19.9723510Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 0 2022-11-23T02:58:19.9723741Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:7 to store for rank: 1 2022-11-23T02:58:19.9724148Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.9724282Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9724494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9724636Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9724885Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9725043Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9725256Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9725343Z self.run() 2022-11-23T02:58:19.9725549Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9725695Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9726044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9726177Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9726597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9726720Z getattr(self, test_name)() 2022-11-23T02:58:19.9727069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9727167Z fn() 2022-11-23T02:58:19.9727540Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9727664Z test(self, **param_kwargs) 2022-11-23T02:58:19.9728027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9728152Z return func(*args, **kwargs) 2022-11-23T02:58:19.9728431Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9728545Z self.run_subtests( 2022-11-23T02:58:19.9728892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9729055Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9729425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9729578Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9729960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9730079Z output = model(*input) 2022-11-23T02:58:19.9730408Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9730549Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9730913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9731095Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9731465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9731586Z _lazy_init(state, module) 2022-11-23T02:58:19.9731946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9732090Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9732431Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9732556Z return func(*args, **kwargs) 2022-11-23T02:58:19.9732936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9733022Z p_assert( 2022-11-23T02:58:19.9733363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9733493Z traceback.print_stack() 2022-11-23T02:58:19.9733897Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:7 with 2 nodes. 2022-11-23T02:58:19.9734026Z File "<string>", line 1, in <module> 2022-11-23T02:58:19.9734294Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9734446Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9734633Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9734784Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9735000Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9735103Z self.run() 2022-11-23T02:58:19.9735307Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9735453Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9735848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9735981Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9736334Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9736459Z getattr(self, test_name)() 2022-11-23T02:58:19.9736825Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9736922Z fn() 2022-11-23T02:58:19.9737291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9737414Z test(self, **param_kwargs) 2022-11-23T02:58:19.9737773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9737899Z return func(*args, **kwargs) 2022-11-23T02:58:19.9738166Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9738280Z self.run_subtests( 2022-11-23T02:58:19.9738639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9738801Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9739171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9739324Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9739772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9739963Z output = model(*input) 2022-11-23T02:58:19.9740378Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9740531Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9740915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9741093Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9741596Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9741719Z _lazy_init(state, module) 2022-11-23T02:58:19.9742169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9742315Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9742703Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9742868Z return func(*args, **kwargs) 2022-11-23T02:58:19.9743330Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9743440Z p_assert( 2022-11-23T02:58:19.9743785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9743912Z traceback.print_stack() 2022-11-23T02:58:19.9744225Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 0 2022-11-23T02:58:19.9744481Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:8 to store for rank: 1 2022-11-23T02:58:19.9744873Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 
2022-11-23T02:58:19.9745004Z File "", line 1, in 2022-11-23T02:58:19.9745217Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9745364Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9745573Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9745785Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9746000Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9746104Z self.run() 2022-11-23T02:58:19.9746295Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9746443Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9746794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9746927Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9747297Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9747422Z getattr(self, test_name)() 2022-11-23T02:58:19.9747784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9747885Z fn() 2022-11-23T02:58:19.9748241Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9748363Z test(self, **param_kwargs) 2022-11-23T02:58:19.9748729Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9748855Z return func(*args, **kwargs) 2022-11-23T02:58:19.9749864Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9749991Z self.run_subtests( 2022-11-23T02:58:19.9750395Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9750559Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9750911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9751070Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9751455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9751575Z output = model(*input) 2022-11-23T02:58:19.9751910Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9752053Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9752435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9752613Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9752968Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9753089Z _lazy_init(state, module) 2022-11-23T02:58:19.9753447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9753600Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9753948Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9754164Z return func(*args, **kwargs) 2022-11-23T02:58:19.9754567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9754671Z p_assert( 2022-11-23T02:58:19.9754999Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9755127Z traceback.print_stack() 2022-11-23T02:58:19.9755534Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:8 with 2 nodes. 2022-11-23T02:58:19.9755664Z File "", line 1, in 2022-11-23T02:58:19.9755881Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9756092Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9756297Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9756432Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9756651Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9756755Z self.run() 2022-11-23T02:58:19.9756961Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9757109Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9757462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9757596Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9757963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9758073Z getattr(self, test_name)() 2022-11-23T02:58:19.9758438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9758536Z fn() 2022-11-23T02:58:19.9758912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9759034Z test(self, **param_kwargs) 2022-11-23T02:58:19.9759398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9759524Z return func(*args, **kwargs) 2022-11-23T02:58:19.9759804Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9759901Z self.run_subtests( 2022-11-23T02:58:19.9760261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9760427Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9760794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9760952Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9761337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9761457Z output = model(*input) 2022-11-23T02:58:19.9761787Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9761910Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9762292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9762473Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9762842Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9762969Z _lazy_init(state, module) 2022-11-23T02:58:19.9763328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9763519Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9763874Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9763983Z return func(*args, **kwargs) 2022-11-23T02:58:19.9764369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9764471Z p_assert( 2022-11-23T02:58:19.9764809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9764935Z traceback.print_stack() 2022-11-23T02:58:19.9765184Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 0 2022-11-23T02:58:19.9765498Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:9 to store for rank: 1 2022-11-23T02:58:19.9765907Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.9766024Z File "", line 1, in 2022-11-23T02:58:19.9766239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9766382Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9766588Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9766739Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9766954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9767057Z self.run() 2022-11-23T02:58:19.9767263Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9767395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9767745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9767877Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9768245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9768371Z getattr(self, test_name)() 2022-11-23T02:58:19.9768738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9768836Z fn() 2022-11-23T02:58:19.9769206Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9769313Z test(self, **param_kwargs) 2022-11-23T02:58:19.9769674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9769805Z return func(*args, **kwargs) 2022-11-23T02:58:19.9770086Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9770199Z self.run_subtests( 2022-11-23T02:58:19.9770561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9770724Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9771093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9771230Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9771612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9771731Z output = model(*input) 2022-11-23T02:58:19.9772061Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9772206Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9772586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9772816Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9773203Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9773308Z _lazy_init(state, module) 2022-11-23T02:58:19.9773667Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:19.9773811Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9774152Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9774277Z return func(*args, **kwargs) 2022-11-23T02:58:19.9774658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9774823Z p_assert( 2022-11-23T02:58:19.9775168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9775282Z traceback.print_stack() 2022-11-23T02:58:19.9775692Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:9 with 2 nodes. 2022-11-23T02:58:19.9775822Z File "", line 1, in 2022-11-23T02:58:19.9776038Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9776181Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9776385Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9776537Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9776755Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9776845Z self.run() 2022-11-23T02:58:19.9777050Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9777196Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9777547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9777680Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9778045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9778168Z getattr(self, test_name)() 2022-11-23T02:58:19.9778516Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9778615Z fn() 2022-11-23T02:58:19.9778981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9779109Z test(self, **param_kwargs) 2022-11-23T02:58:19.9779468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9779593Z return func(*args, **kwargs) 2022-11-23T02:58:19.9779878Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9779993Z self.run_subtests( 2022-11-23T02:58:19.9780334Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9780497Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9780867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9781022Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9781402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9781525Z output = model(*input) 2022-11-23T02:58:19.9781854Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9781997Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9782407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9782590Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9782967Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:19.9783091Z _lazy_init(state, module) 2022-11-23T02:58:19.9783445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9783589Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9783986Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9784181Z return func(*args, **kwargs) 2022-11-23T02:58:19.9784557Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9784664Z p_assert( 2022-11-23T02:58:19.9785007Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9785137Z traceback.print_stack() 2022-11-23T02:58:19.9785387Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 0 2022-11-23T02:58:19.9785632Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:10 to store for rank: 1 2022-11-23T02:58:19.9786038Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 
2022-11-23T02:58:19.9786169Z File "", line 1, in 2022-11-23T02:58:19.9786371Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9786515Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9786720Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9786875Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9787092Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9787196Z self.run() 2022-11-23T02:58:19.9787405Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9787552Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9787879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9788012Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9788380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9788507Z getattr(self, test_name)() 2022-11-23T02:58:19.9788873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9789453Z fn() 2022-11-23T02:58:19.9789948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9790073Z test(self, **param_kwargs) 2022-11-23T02:58:19.9790417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9790543Z return func(*args, **kwargs) 2022-11-23T02:58:19.9790825Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9790938Z self.run_subtests( 2022-11-23T02:58:19.9791298Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9791464Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9791834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9792064Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9792444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9792565Z output = model(*input) 2022-11-23T02:58:19.9792897Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9793039Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9793421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9793601Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9794043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9794167Z _lazy_init(state, module) 2022-11-23T02:58:19.9794510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9794656Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9795000Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9795128Z return func(*args, **kwargs) 2022-11-23T02:58:19.9795509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9795613Z p_assert( 2022-11-23T02:58:19.9795956Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9796084Z traceback.print_stack() 2022-11-23T02:58:19.9796478Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:10 with 2 nodes. 2022-11-23T02:58:19.9796607Z File "", line 1, in 2022-11-23T02:58:19.9796821Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9796968Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9797174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9797326Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9797539Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9797644Z self.run() 2022-11-23T02:58:19.9797835Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9797983Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9798331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9798470Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9798836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9798960Z getattr(self, test_name)() 2022-11-23T02:58:19.9799327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9799408Z fn() 2022-11-23T02:58:19.9799781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9799904Z test(self, **param_kwargs) 2022-11-23T02:58:19.9800264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9800390Z return func(*args, **kwargs) 2022-11-23T02:58:19.9800673Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9800793Z self.run_subtests( 2022-11-23T02:58:19.9801153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9801300Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9801720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9801881Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9802265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9802387Z output = model(*input) 2022-11-23T02:58:19.9802718Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9802860Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9803244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9803469Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9803829Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9803955Z _lazy_init(state, module) 2022-11-23T02:58:19.9804310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9804454Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9804798Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9804923Z return func(*args, **kwargs) 2022-11-23T02:58:19.9805308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9805410Z p_assert( 2022-11-23T02:58:19.9805740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9805867Z traceback.print_stack() 2022-11-23T02:58:19.9806118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 1 2022-11-23T02:58:19.9806366Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:11 to store for rank: 0 2022-11-23T02:58:19.9806770Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.9806903Z File "", line 1, in 2022-11-23T02:58:19.9807122Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9807266Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9807455Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9807605Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9807826Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9807931Z self.run() 2022-11-23T02:58:19.9808137Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9808287Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9808635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9808752Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9809120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9809243Z getattr(self, test_name)() 2022-11-23T02:58:19.9809607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9809706Z fn() 
2022-11-23T02:58:19.9810076Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9810202Z test(self, **param_kwargs) 2022-11-23T02:58:19.9810562Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9810672Z return func(*args, **kwargs) 2022-11-23T02:58:19.9811002Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9811123Z self.run_subtests( 2022-11-23T02:58:19.9811486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9811652Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9812022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9812181Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9812610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9812711Z output = model(*input) 2022-11-23T02:58:19.9813044Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9813193Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9813572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9813748Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9814122Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9814244Z _lazy_init(state, module) 2022-11-23T02:58:19.9814601Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9814728Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9815072Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9815198Z return func(*args, **kwargs) 2022-11-23T02:58:19.9815589Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9815693Z p_assert( 2022-11-23T02:58:19.9816034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9816160Z traceback.print_stack() 2022-11-23T02:58:19.9816566Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:11 with 2 nodes. 2022-11-23T02:58:19.9816679Z File "", line 1, in 2022-11-23T02:58:19.9816894Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9817036Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9817242Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9817394Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9817610Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9817718Z self.run() 2022-11-23T02:58:19.9817924Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9818054Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9818400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9818532Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9818897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:58:19.9819019Z getattr(self, test_name)() 2022-11-23T02:58:19.9819389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9819490Z fn() 2022-11-23T02:58:19.9819862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9819967Z test(self, **param_kwargs) 2022-11-23T02:58:19.9820372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9820506Z return func(*args, **kwargs) 2022-11-23T02:58:19.9820788Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9820902Z self.run_subtests( 2022-11-23T02:58:19.9821263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9821428Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9821794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9821977Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9822360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9822483Z output = model(*input) 2022-11-23T02:58:19.9822816Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9822957Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9823340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9823516Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9823889Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9823992Z _lazy_init(state, module) 2022-11-23T02:58:19.9824351Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9824494Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9824838Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9824964Z return func(*args, **kwargs) 2022-11-23T02:58:19.9825350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9825453Z p_assert( 2022-11-23T02:58:19.9825792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9825901Z traceback.print_stack() 2022-11-23T02:58:19.9826153Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 0 2022-11-23T02:58:19.9826397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:12 to store for rank: 1 2022-11-23T02:58:19.9826806Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 
2022-11-23T02:58:19.9826936Z File "", line 1, in 2022-11-23T02:58:19.9827151Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9827294Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9827501Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9827636Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9827854Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9827958Z self.run() 2022-11-23T02:58:19.9828163Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9828309Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9828653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9828790Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9829869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9830085Z getattr(self, test_name)() 2022-11-23T02:58:19.9830478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9830577Z fn() 2022-11-23T02:58:19.9830949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9831071Z test(self, **param_kwargs) 2022-11-23T02:58:19.9831431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9831555Z return func(*args, **kwargs) 2022-11-23T02:58:19.9831820Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9832001Z self.run_subtests( 2022-11-23T02:58:19.9832368Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9832537Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9832911Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9833065Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9833444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9833565Z output = model(*input) 2022-11-23T02:58:19.9833878Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9834021Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9834405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9834582Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9834957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9835077Z _lazy_init(state, module) 2022-11-23T02:58:19.9835433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9835578Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9835919Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9836027Z return func(*args, **kwargs) 2022-11-23T02:58:19.9836410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9836517Z p_assert( 2022-11-23T02:58:19.9836856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9836981Z traceback.print_stack() 2022-11-23T02:58:19.9837388Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:12 with 2 nodes. 2022-11-23T02:58:19.9837518Z File "", line 1, in 2022-11-23T02:58:19.9837716Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9837861Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9838064Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9838216Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9838432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9838536Z self.run() 2022-11-23T02:58:19.9838746Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9838893Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9839220Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9839399Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9839778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9839901Z getattr(self, test_name)() 2022-11-23T02:58:19.9840265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9840364Z fn() 2022-11-23T02:58:19.9840731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9840855Z test(self, **param_kwargs) 2022-11-23T02:58:19.9841198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9841373Z return func(*args, **kwargs) 2022-11-23T02:58:19.9841655Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9841773Z self.run_subtests( 2022-11-23T02:58:19.9842135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9842297Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9842662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9842817Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9843181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9843301Z output = model(*input) 2022-11-23T02:58:19.9843635Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9843776Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9844160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9844338Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9844710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9844831Z _lazy_init(state, module) 2022-11-23T02:58:19.9845170Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9845314Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9845653Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9845783Z return func(*args, **kwargs) 2022-11-23T02:58:19.9846169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:19.9846270Z p_assert( 2022-11-23T02:58:19.9846613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9846739Z traceback.print_stack() 2022-11-23T02:58:19.9846972Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 0 2022-11-23T02:58:19.9847216Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:13 to store for rank: 1 2022-11-23T02:58:19.9847624Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.9847755Z File "", line 1, in 2022-11-23T02:58:19.9847969Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9848115Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9848320Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9848471Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9848720Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9848833Z self.run() 2022-11-23T02:58:19.9849041Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9849191Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9849536Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9849670Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9850038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9850161Z getattr(self, test_name)() 2022-11-23T02:58:19.9850630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9850729Z fn() 
2022-11-23T02:58:19.9851098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9851225Z test(self, **param_kwargs) 2022-11-23T02:58:19.9851584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9851709Z return func(*args, **kwargs) 2022-11-23T02:58:19.9851993Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9852089Z self.run_subtests( 2022-11-23T02:58:19.9852448Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9852610Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9852984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9853137Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9853519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9853640Z output = model(*input) 2022-11-23T02:58:19.9853972Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9854116Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9854480Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9854660Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9855034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9855160Z _lazy_init(state, module) 2022-11-23T02:58:19.9855516Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9855659Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9856005Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9856130Z return func(*args, **kwargs) 2022-11-23T02:58:19.9856503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9856606Z p_assert( 2022-11-23T02:58:19.9856949Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9857074Z traceback.print_stack() 2022-11-23T02:58:19.9857481Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:13 with 2 nodes. 2022-11-23T02:58:19.9857616Z File "", line 1, in 2022-11-23T02:58:19.9857830Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9857957Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9858210Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9858369Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9858588Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9858692Z self.run() 2022-11-23T02:58:19.9858899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9859044Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9859393Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9859508Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9859927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 
2022-11-23T02:58:19.9860051Z getattr(self, test_name)() 2022-11-23T02:58:19.9860414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9860515Z fn() 2022-11-23T02:58:19.9860886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9861009Z test(self, **param_kwargs) 2022-11-23T02:58:19.9861367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9861476Z return func(*args, **kwargs) 2022-11-23T02:58:19.9861761Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 258, in test_mixture_of_experts_with_delay_before_free 2022-11-23T02:58:19.9861875Z self.run_subtests( 2022-11-23T02:58:19.9862238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9862400Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9862770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9862923Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9863301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9863404Z output = model(*input) 2022-11-23T02:58:19.9863736Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9863877Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9864258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9864443Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9864819Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9864938Z _lazy_init(state, module) 2022-11-23T02:58:19.9865299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:19.9865426Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9865769Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9865894Z return func(*args, **kwargs) 2022-11-23T02:58:19.9866279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9866381Z p_assert( 2022-11-23T02:58:19.9866722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:19.9866852Z traceback.print_stack() 2022-11-23T02:58:19.9867101Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 0 2022-11-23T02:58:19.9867328Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:14 to store for rank: 1 2022-11-23T02:58:19.9867780Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.9868195Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:14 with 2 nodes. 2022-11-23T02:58:19.9868442Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 0 2022-11-23T02:58:19.9868684Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:15 to store for rank: 1 2022-11-23T02:58:19.9869523Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 
2022-11-23T02:58:19.9870012Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:15 with 2 nodes. 2022-11-23T02:58:19.9870257Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 0 2022-11-23T02:58:19.9870496Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:16 to store for rank: 1 2022-11-23T02:58:19.9870903Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.9871303Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:16 with 2 nodes. 2022-11-23T02:58:19.9871547Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 0 2022-11-23T02:58:19.9871767Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:17 to store for rank: 1 2022-11-23T02:58:19.9872172Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9872566Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:17 with 2 nodes. 2022-11-23T02:58:19.9873333Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:19.9873583Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 0 2022-11-23T02:58:19.9873826Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:18 to store for rank: 1 2022-11-23T02:58:19.9874222Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9874625Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:18 with 2 nodes. 2022-11-23T02:58:19.9875386Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9875634Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 0 2022-11-23T02:58:19.9876031Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.9876259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:19 to store for rank: 1 2022-11-23T02:58:19.9876659Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:19 with 2 nodes. 2022-11-23T02:58:19.9876909Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 0 2022-11-23T02:58:19.9877149Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:20 to store for rank: 1 2022-11-23T02:58:19.9877608Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 
2022-11-23T02:58:19.9878015Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:20 with 2 nodes. 2022-11-23T02:58:19.9878260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 1 2022-11-23T02:58:19.9878501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:21 to store for rank: 0 2022-11-23T02:58:19.9878898Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.9879325Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:21 with 2 nodes. 2022-11-23T02:58:19.9879573Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 1 2022-11-23T02:58:19.9879814Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:22 to store for rank: 0 2022-11-23T02:58:19.9880210Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.9880605Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:22 with 2 nodes. 2022-11-23T02:58:19.9880849Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 0 2022-11-23T02:58:19.9881090Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:23 to store for rank: 1 2022-11-23T02:58:19.9881485Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 2022-11-23T02:58:19.9881890Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:23 with 2 nodes. 
2022-11-23T02:58:19.9882138Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 0 2022-11-23T02:58:19.9882358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:24 to store for rank: 1 2022-11-23T02:58:19.9882756Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.9883151Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:24 with 2 nodes. 2022-11-23T02:58:19.9883908Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9884158Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 1 2022-11-23T02:58:19.9884397Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:25 to store for rank: 0 2022-11-23T02:58:19.9884794Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.9885194Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:25 with 2 nodes. 2022-11-23T02:58:19.9885440Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 0 2022-11-23T02:58:19.9885682Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:26 to store for rank: 1 2022-11-23T02:58:19.9886064Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 
2022-11-23T02:58:19.9886462Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:26 with 2 nodes. 2022-11-23T02:58:19.9886748Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 0 2022-11-23T02:58:19.9886991Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:27 to store for rank: 1 2022-11-23T02:58:19.9887388Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.9887781Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:27 with 2 nodes. 2022-11-23T02:58:19.9888024Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 0 2022-11-23T02:58:19.9888259Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:28 to store for rank: 1 2022-11-23T02:58:19.9888701Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.9889076Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:28 with 2 nodes. 2022-11-23T02:58:19.9889320Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 0 2022-11-23T02:58:19.9889561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:29 to store for rank: 1 2022-11-23T02:58:19.9889958Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 2022-11-23T02:58:19.9890352Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:29 with 2 nodes. 
2022-11-23T02:58:19.9890593Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 0 2022-11-23T02:58:19.9890833Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:30 to store for rank: 1 2022-11-23T02:58:19.9891232Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.9891627Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:30 with 2 nodes. 2022-11-23T02:58:19.9892388Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9892636Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 1 2022-11-23T02:58:19.9892857Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:31 to store for rank: 0 2022-11-23T02:58:19.9893261Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.9893659Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:31 with 2 nodes. 2022-11-23T02:58:19.9893903Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 0 2022-11-23T02:58:19.9894141Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:32 to store for rank: 1 2022-11-23T02:58:19.9894538Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 
2022-11-23T02:58:19.9894935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:32 with 2 nodes. 2022-11-23T02:58:19.9895686Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9895983Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 0 2022-11-23T02:58:19.9896232Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:33 to store for rank: 1 2022-11-23T02:58:19.9896613Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.9897014Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:33 with 2 nodes. 2022-11-23T02:58:19.9897261Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 0 2022-11-23T02:58:19.9897501Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:34 to store for rank: 1 2022-11-23T02:58:19.9897960Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 2022-11-23T02:58:19.9898358Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:34 with 2 nodes. 
2022-11-23T02:58:19.9898601Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 0 2022-11-23T02:58:19.9898842Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:35 to store for rank: 1 2022-11-23T02:58:19.9899237Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.9899612Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:35 with 2 nodes. 2022-11-23T02:58:19.9899856Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 1 2022-11-23T02:58:19.9900100Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:36 to store for rank: 0 2022-11-23T02:58:19.9900501Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.9900896Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:36 with 2 nodes. 2022-11-23T02:58:19.9901655Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:19.9901901Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 1 2022-11-23T02:58:19.9902140Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:37 to store for rank: 0 2022-11-23T02:58:19.9902537Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 
2022-11-23T02:58:19.9902935Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:37 with 2 nodes. 2022-11-23T02:58:19.9903031Z dist init r=0, world=2 2022-11-23T02:58:19.9903369Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9903706Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9904029Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9904350Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9904707Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9905024Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9905333Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9905638Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9905944Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:19.9906300Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9906612Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9906901Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:19.9907018Z dist init r=1, world=2 2022-11-23T02:58:19.9907344Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9907662Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9907979Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9908294Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9908604Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9908916Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9909663Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:19.9909981Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9910293Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9910584Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9910897Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9911205Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:19.9911309Z ok (24.445s) 2022-11-23T02:58:19.9911677Z test_nested_always_wrap_model_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90763 2022-11-23T02:58:19.9911904Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90764 2022-11-23T02:58:19.9912362Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9912550Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9912944Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9913139Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9913493Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9913672Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9914135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9914327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9914579Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.9914829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.9915236Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9915637Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.9915873Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.9916087Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.9916331Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9916570Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9917607Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9917723Z warnings.warn( 2022-11-23T02:58:19.9918754Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9918872Z warnings.warn( 2022-11-23T02:58:19.9919114Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9919350Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9919585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9919823Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9920038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9920269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9920504Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9920738Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9921027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9921269Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9921500Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9921733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9921943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9922175Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9922402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9922682Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9922912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9923145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9923374Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9923604Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9923812Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9924040Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9924270Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9924497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9924729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9924956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9925190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9925420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9925650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9925858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9926088Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9926315Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9926546Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9926773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9927005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9927231Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9927462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9927671Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9927900Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9928128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9928357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9928589Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9928817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9929091Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9929332Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9929540Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9929659Z dist init r=0, world=2 2022-11-23T02:58:19.9929770Z dist init r=1, world=2 2022-11-23T02:58:19.9929871Z ok (5.814s) 2022-11-23T02:58:19.9930231Z test_nested_always_wrap_model_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90846 2022-11-23T02:58:19.9930455Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90847 2022-11-23T02:58:19.9930892Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9931073Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9931441Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9931636Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9932010Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9932187Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9932570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9932761Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9933012Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.9933260Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.9933667Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9934053Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.9934287Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.9934518Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.9934755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9934991Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9936029Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9936142Z warnings.warn( 2022-11-23T02:58:19.9937167Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9937282Z warnings.warn( 2022-11-23T02:58:19.9937520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9937755Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9938021Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9938260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9938495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9938728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9938960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9939192Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9939468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9939699Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9939912Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9940147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9940377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9940608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9940835Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9941065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9941292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9941527Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9941756Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9941969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9942197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9942427Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9942654Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9942881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9943107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9943333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9943565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9943774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9944004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9944234Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9944462Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9944691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9944917Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9945145Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9945373Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9945606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9945815Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9946103Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9946339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9946565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9946793Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9947025Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9947255Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9947530Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9947739Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9947968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9948085Z dist init r=0, world=2 2022-11-23T02:58:19.9948198Z dist init r=1, world=2 2022-11-23T02:58:19.9948299Z ok (6.214s) 2022-11-23T02:58:19.9948754Z test_nested_always_wrap_model_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 90929 2022-11-23T02:58:19.9949405Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 90930 2022-11-23T02:58:19.9949808Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9949969Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9950392Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9950586Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9950963Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9951141Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9951526Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9951719Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9951968Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.9952196Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.9952605Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9953007Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.9953245Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.9953479Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.9953716Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9953956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9954986Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9955105Z warnings.warn( 2022-11-23T02:58:19.9956202Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9956321Z warnings.warn( 2022-11-23T02:58:19.9956559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9956777Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9957070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9957306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9957540Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9957772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9958004Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9958238Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9958467Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9958677Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9958908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9959140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9959370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9959602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9959832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9960061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9960291Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9960499Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9960729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9960957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9961191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9961420Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9961655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9961885Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9962115Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9962342Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9962553Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9962782Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9963015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9963242Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9963515Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9963751Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9963981Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9964209Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9964418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9964646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9964874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:19.9965148Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9965379Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9965611Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9965840Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9966072Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9966298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9966508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9966734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9966964Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9967081Z dist init r=0, world=2 2022-11-23T02:58:19.9967192Z dist init r=1, world=2 2022-11-23T02:58:19.9967292Z ok (6.214s) 2022-11-23T02:58:19.9967658Z test_nested_always_wrap_model_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91012 2022-11-23T02:58:19.9967866Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91013 2022-11-23T02:58:19.9968253Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9968434Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9968818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9969012Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9969390Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:19.9969567Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:19.9969951Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:19.9970145Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:19.9970377Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:19.9970627Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:19.9971032Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:19.9971431Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:19.9971669Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:19.9971904Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:19.9972187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9972431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9973465Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:19.9973581Z warnings.warn( 2022-11-23T02:58:19.9974677Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:19.9974771Z warnings.warn( 2022-11-23T02:58:19.9974903Z File "", line 1, in 2022-11-23T02:58:19.9975123Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9975267Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9975480Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9975633Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9975851Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9975959Z self.run() 2022-11-23T02:58:19.9976147Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9976296Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9976653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9976789Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9977159Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9977284Z getattr(self, test_name)() 2022-11-23T02:58:19.9977654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9977752Z fn() 2022-11-23T02:58:19.9978108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9978237Z test(self, **param_kwargs) 2022-11-23T02:58:19.9978599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9978725Z return func(*args, **kwargs) 2022-11-23T02:58:19.9978992Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:19.9979111Z self.run_subtests( 2022-11-23T02:58:19.9979471Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9979636Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9979991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9980146Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9980526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9980654Z output = model(*input) 2022-11-23T02:58:19.9980985Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9981127Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9981559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9981746Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9982103Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9982227Z _lazy_init(state, module) 2022-11-23T02:58:19.9982584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:19.9982729Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9983128Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9983257Z return func(*args, **kwargs) 2022-11-23T02:58:19.9983646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9983752Z p_assert( 2022-11-23T02:58:19.9984079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9984208Z traceback.print_stack() 2022-11-23T02:58:19.9984339Z File "", line 1, in 2022-11-23T02:58:19.9984552Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9984696Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9984902Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9985054Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9985259Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9985364Z self.run() 2022-11-23T02:58:19.9985570Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9985717Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9986068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9986205Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9986575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9986700Z getattr(self, test_name)() 2022-11-23T02:58:19.9987048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9987146Z fn() 2022-11-23T02:58:19.9987518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9987646Z test(self, **param_kwargs) 2022-11-23T02:58:19.9988008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9988136Z return func(*args, **kwargs) 2022-11-23T02:58:19.9988404Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:19.9988520Z self.run_subtests( 2022-11-23T02:58:19.9988862Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9989415Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9989799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9989957Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:19.9990340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:19.9990467Z output = model(*input) 2022-11-23T02:58:19.9990799Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:19.9991016Z return forward_call(*input, **kwargs) 2022-11-23T02:58:19.9991395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:19.9991578Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:19.9991950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:19.9992072Z _lazy_init(state, module) 2022-11-23T02:58:19.9992428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:19.9992573Z handle.init_flat_param_attributes() 2022-11-23T02:58:19.9992981Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:19.9993106Z return func(*args, **kwargs) 2022-11-23T02:58:19.9993478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:19.9993582Z p_assert( 2022-11-23T02:58:19.9993926Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:19.9994054Z traceback.print_stack() 2022-11-23T02:58:19.9994298Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9994538Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:19.9994670Z File "", line 1, in 2022-11-23T02:58:19.9994884Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:19.9995014Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:19.9995219Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:19.9995371Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:19.9995587Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:19.9995693Z self.run() 2022-11-23T02:58:19.9995896Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:19.9996042Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:19.9996371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:19.9996508Z self.run_test(test_name, pipe) 2022-11-23T02:58:19.9996876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:19.9997000Z getattr(self, test_name)() 2022-11-23T02:58:19.9997363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:19.9997465Z fn() 2022-11-23T02:58:19.9997838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:19.9997964Z test(self, **param_kwargs) 2022-11-23T02:58:19.9998309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:19.9998437Z return func(*args, 
**kwargs) 2022-11-23T02:58:19.9998697Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:19.9998812Z self.run_subtests( 2022-11-23T02:58:19.9999171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:19.9999338Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:19.9999708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:19.9999867Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0000279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0000407Z output = model(*input) 2022-11-23T02:58:20.0000741Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0000885Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0001270Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0001448Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0001820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0001990Z _lazy_init(state, module) 2022-11-23T02:58:20.0002331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0002476Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0002825Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0002952Z return func(*args, **kwargs) 2022-11-23T02:58:20.0003338Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0003441Z p_assert( 2022-11-23T02:58:20.0003784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0003912Z traceback.print_stack() 2022-11-23T02:58:20.0004024Z File "", line 1, in 2022-11-23T02:58:20.0004237Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0004383Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0004588Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0004744Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0004964Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0005069Z self.run() 2022-11-23T02:58:20.0005275Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0005405Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0005749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0005884Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0006252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0006375Z getattr(self, test_name)() 2022-11-23T02:58:20.0006744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0006842Z fn() 2022-11-23T02:58:20.0007195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0007324Z test(self, **param_kwargs) 2022-11-23T02:58:20.0007685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0007812Z return func(*args, **kwargs) 2022-11-23T02:58:20.0008071Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0008189Z self.run_subtests( 2022-11-23T02:58:20.0008550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0008713Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0009069Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0009223Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0009648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0009772Z output = model(*input) 2022-11-23T02:58:20.0010109Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0010251Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0010629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0010806Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0011179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0011332Z _lazy_init(state, module) 2022-11-23T02:58:20.0011692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0011836Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0012184Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0012311Z return func(*args, **kwargs) 2022-11-23T02:58:20.0012697Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0012798Z p_assert( 2022-11-23T02:58:20.0013140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0013250Z traceback.print_stack() 2022-11-23T02:58:20.0013494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0013730Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0013864Z File "", line 1, in 2022-11-23T02:58:20.0014077Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0014220Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0014430Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0014564Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0014780Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0014884Z self.run() 2022-11-23T02:58:20.0015090Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0015237Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0015586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0015720Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0016092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0016197Z getattr(self, test_name)() 2022-11-23T02:58:20.0016567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0016666Z fn() 2022-11-23T02:58:20.0017039Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0017162Z test(self, **param_kwargs) 2022-11-23T02:58:20.0017526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0017652Z return func(*args, **kwargs) 2022-11-23T02:58:20.0017910Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0018007Z self.run_subtests( 2022-11-23T02:58:20.0018367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0018531Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0018948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0019108Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0019494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0019615Z output = model(*input) 2022-11-23T02:58:20.0019944Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0020069Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0020451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0020677Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0021053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0021174Z _lazy_init(state, module) 2022-11-23T02:58:20.0021535Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0021680Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0022027Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0022134Z return func(*args, **kwargs) 2022-11-23T02:58:20.0022516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0022623Z p_assert( 2022-11-23T02:58:20.0022962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0023094Z traceback.print_stack() 2022-11-23T02:58:20.0023223Z File "", line 1, in 2022-11-23T02:58:20.0023436Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0023582Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0023774Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0023925Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0024141Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0024245Z self.run() 2022-11-23T02:58:20.0024451Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0024599Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0024945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0025065Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0025431Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0025555Z getattr(self, test_name)() 2022-11-23T02:58:20.0025923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0026022Z fn() 2022-11-23T02:58:20.0026390Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0026516Z test(self, **param_kwargs) 2022-11-23T02:58:20.0026876Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0026984Z return func(*args, **kwargs) 2022-11-23T02:58:20.0027246Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0027364Z self.run_subtests( 2022-11-23T02:58:20.0027725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0027889Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0028307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0028470Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0028856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0029322Z output = model(*input) 2022-11-23T02:58:20.0029674Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0029820Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0030204Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0030458Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0030837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0030963Z _lazy_init(state, module) 2022-11-23T02:58:20.0031321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0031446Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0031789Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0031914Z return func(*args, **kwargs) 2022-11-23T02:58:20.0032299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0032401Z p_assert( 2022-11-23T02:58:20.0032743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0032875Z traceback.print_stack() 2022-11-23T02:58:20.0033118Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0033341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0033475Z File "", line 1, in 2022-11-23T02:58:20.0033687Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0033831Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0034036Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0034189Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0034408Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0034512Z self.run() 2022-11-23T02:58:20.0034700Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0034850Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0035196Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0035331Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0035703Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0035829Z getattr(self, 
test_name)() 2022-11-23T02:58:20.0036193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0036273Z fn() 2022-11-23T02:58:20.0036643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0036767Z test(self, **param_kwargs) 2022-11-23T02:58:20.0037129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0037259Z return func(*args, **kwargs) 2022-11-23T02:58:20.0037521Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0037634Z self.run_subtests( 2022-11-23T02:58:20.0038054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0038206Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0038578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0038732Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0039115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0039236Z output = model(*input) 2022-11-23T02:58:20.0039565Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0039771Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0040158Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0040323Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0040697Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:58:20.0040818Z _lazy_init(state, module) 2022-11-23T02:58:20.0041172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0041316Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0041658Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0041783Z return func(*args, **kwargs) 2022-11-23T02:58:20.0042178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0042262Z p_assert( 2022-11-23T02:58:20.0042603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0042731Z traceback.print_stack() 2022-11-23T02:58:20.0042863Z File "", line 1, in 2022-11-23T02:58:20.0043076Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0043220Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0043425Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0043578Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0043775Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0043881Z self.run() 2022-11-23T02:58:20.0044087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0044241Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0044589Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0044723Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0045096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0045219Z getattr(self, test_name)() 
2022-11-23T02:58:20.0045567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0045667Z fn() 2022-11-23T02:58:20.0046037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0046159Z test(self, **param_kwargs) 2022-11-23T02:58:20.0046524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0046653Z return func(*args, **kwargs) 2022-11-23T02:58:20.0046912Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0047008Z self.run_subtests( 2022-11-23T02:58:20.0047412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0047584Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0047958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0048111Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0048492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0048613Z output = model(*input) 2022-11-23T02:58:20.0048945Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0049135Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0049502Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0049683Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0050057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0050182Z _lazy_init(state, module) 2022-11-23T02:58:20.0050585Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0050732Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0051077Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0051203Z return func(*args, **kwargs) 2022-11-23T02:58:20.0051576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0051681Z p_assert( 2022-11-23T02:58:20.0052022Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0052149Z traceback.print_stack() 2022-11-23T02:58:20.0052392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0052633Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0052767Z File "", line 1, in 2022-11-23T02:58:20.0052960Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0053103Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0053308Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0053461Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0053680Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0053786Z self.run() 2022-11-23T02:58:20.0053992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0054143Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0054473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0054608Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0054975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0055099Z getattr(self, test_name)() 2022-11-23T02:58:20.0055464Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0055562Z fn() 2022-11-23T02:58:20.0055933Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0056061Z test(self, **param_kwargs) 2022-11-23T02:58:20.0056404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0056579Z return func(*args, **kwargs) 2022-11-23T02:58:20.0056846Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0056961Z self.run_subtests( 2022-11-23T02:58:20.0057319Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0057484Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0057855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0058011Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0058447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0058570Z output = model(*input) 2022-11-23T02:58:20.0058903Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0059047Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0059427Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0059606Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0059975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0060097Z _lazy_init(state, module) 2022-11-23T02:58:20.0060436Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0060584Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0060926Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0061051Z return func(*args, **kwargs) 2022-11-23T02:58:20.0061437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0061544Z p_assert( 2022-11-23T02:58:20.0061885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0062013Z traceback.print_stack() 2022-11-23T02:58:20.0062125Z File "", line 1, in 2022-11-23T02:58:20.0062338Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0062480Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0062688Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0062844Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0063061Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0063165Z self.run() 2022-11-23T02:58:20.0063352Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0063507Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0063857Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0063990Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0064359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0064481Z getattr(self, test_name)() 2022-11-23T02:58:20.0064849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0064948Z fn() 2022-11-23T02:58:20.0065303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0065426Z test(self, **param_kwargs) 2022-11-23T02:58:20.0065787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0065956Z return func(*args, **kwargs) 2022-11-23T02:58:20.0066222Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0066337Z self.run_subtests( 2022-11-23T02:58:20.0066698Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0066861Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0067211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0067365Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0067794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0067913Z output = model(*input) 2022-11-23T02:58:20.0068248Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0068393Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0068777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0069344Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0069721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0069846Z _lazy_init(state, module) 2022-11-23T02:58:20.0070202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0070351Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0070696Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0070822Z return func(*args, **kwargs) 2022-11-23T02:58:20.0071210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0071315Z p_assert( 2022-11-23T02:58:20.0071636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0071763Z traceback.print_stack() 2022-11-23T02:58:20.0072005Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0072243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0072373Z File "", line 1, in 2022-11-23T02:58:20.0072589Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0072740Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0072946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0073079Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0073299Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0073403Z self.run() 2022-11-23T02:58:20.0073611Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0073758Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0074106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0074240Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0074610Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0074721Z getattr(self, test_name)() 2022-11-23T02:58:20.0075085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0075184Z fn() 2022-11-23T02:58:20.0075623Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0075756Z test(self, **param_kwargs) 2022-11-23T02:58:20.0076124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0076251Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0076492Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0076608Z self.run_subtests( 2022-11-23T02:58:20.0076964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0077126Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0077560Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0077714Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0078100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0078224Z output = model(*input) 2022-11-23T02:58:20.0078537Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0078678Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0079059Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0079238Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0079610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0079738Z _lazy_init(state, module) 2022-11-23T02:58:20.0080095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0080242Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0080586Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0080695Z return func(*args, **kwargs) 2022-11-23T02:58:20.0081078Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0081179Z p_assert( 2022-11-23T02:58:20.0081520Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0081648Z traceback.print_stack() 2022-11-23T02:58:20.0081777Z File "", line 1, in 2022-11-23T02:58:20.0081995Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0082120Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0082325Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0082480Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0082696Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0082801Z self.run() 2022-11-23T02:58:20.0083008Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0083155Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0083502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0083617Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0084039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0084171Z getattr(self, test_name)() 2022-11-23T02:58:20.0084541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0084639Z fn() 2022-11-23T02:58:20.0085054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0085185Z test(self, **param_kwargs) 2022-11-23T02:58:20.0085550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0085658Z return func(*args, **kwargs) 2022-11-23T02:58:20.0085919Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0086035Z self.run_subtests( 2022-11-23T02:58:20.0086391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0086603Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0086977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0087131Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0087515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0087619Z output = model(*input) 2022-11-23T02:58:20.0087950Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0088091Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0088475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0088652Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0089026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0089151Z _lazy_init(state, module) 2022-11-23T02:58:20.0089507Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0089636Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0089986Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0090112Z return func(*args, **kwargs) 2022-11-23T02:58:20.0090496Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0090600Z p_assert( 2022-11-23T02:58:20.0090948Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0091076Z traceback.print_stack() 2022-11-23T02:58:20.0091321Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0091546Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0091680Z File "", line 1, in 2022-11-23T02:58:20.0091897Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0092041Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0092247Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0092400Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0092615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0092702Z self.run() 2022-11-23T02:58:20.0092909Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0093055Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0093401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0093538Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0093909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0094032Z getattr(self, test_name)() 2022-11-23T02:58:20.0094447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0094533Z fn() 2022-11-23T02:58:20.0094908Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0095032Z test(self, **param_kwargs) 2022-11-23T02:58:20.0095391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0095517Z return func(*args, **kwargs) 2022-11-23T02:58:20.0095779Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0095939Z self.run_subtests( 2022-11-23T02:58:20.0096298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0096444Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0096821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0096976Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0097357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0097477Z output = model(*input) 2022-11-23T02:58:20.0097809Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0097951Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0098336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0098500Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0098877Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0099000Z _lazy_init(state, module) 2022-11-23T02:58:20.0099358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0099502Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0099845Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0099971Z return func(*args, **kwargs) 2022-11-23T02:58:20.0100355Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0100439Z p_assert( 2022-11-23T02:58:20.0100786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0100914Z traceback.print_stack() 2022-11-23T02:58:20.0101045Z File "", line 1, in 2022-11-23T02:58:20.0101260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0101403Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0101608Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0101760Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0101960Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0102065Z self.run() 2022-11-23T02:58:20.0102271Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0102419Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0102761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0102899Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0103264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0103414Z getattr(self, test_name)() 2022-11-23T02:58:20.0103785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0103883Z fn() 2022-11-23T02:58:20.0104252Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0104376Z test(self, **param_kwargs) 2022-11-23T02:58:20.0104735Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0104860Z return func(*args, **kwargs) 2022-11-23T02:58:20.0105121Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0105275Z self.run_subtests( 2022-11-23T02:58:20.0105639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0105807Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0106175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0106327Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0106709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0106831Z output = model(*input) 2022-11-23T02:58:20.0107163Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0107288Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0107670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0107852Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0108231Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0108352Z _lazy_init(state, module) 2022-11-23T02:58:20.0108706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0108850Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0109403Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0109513Z return func(*args, **kwargs) 2022-11-23T02:58:20.0109903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0110009Z p_assert( 2022-11-23T02:58:20.0110356Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0110484Z traceback.print_stack() 2022-11-23T02:58:20.0110727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0110972Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0111104Z File "", line 1, in 2022-11-23T02:58:20.0111298Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0111442Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0111644Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0111796Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0112011Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0112116Z self.run() 2022-11-23T02:58:20.0112325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0112473Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0112803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0113007Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0113388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0113513Z getattr(self, 
test_name)() 2022-11-23T02:58:20.0113880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0113978Z fn() 2022-11-23T02:58:20.0114349Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0114474Z test(self, **param_kwargs) 2022-11-23T02:58:20.0114818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0115007Z return func(*args, **kwargs) 2022-11-23T02:58:20.0115266Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0115384Z self.run_subtests( 2022-11-23T02:58:20.0115745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0115909Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0116280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0116435Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0116801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0116924Z output = model(*input) 2022-11-23T02:58:20.0117260Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0117402Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0117787Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0117968Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0118342Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:58:20.0118462Z _lazy_init(state, module) 2022-11-23T02:58:20.0118803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0118948Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0119293Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0119425Z return func(*args, **kwargs) 2022-11-23T02:58:20.0119811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0119915Z p_assert( 2022-11-23T02:58:20.0120260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0120390Z traceback.print_stack() 2022-11-23T02:58:20.0120503Z File "", line 1, in 2022-11-23T02:58:20.0120717Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0120860Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0121067Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0121220Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0121437Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0121545Z self.run() 2022-11-23T02:58:20.0121733Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0121881Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0122270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0122409Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0122778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0122903Z getattr(self, test_name)() 
2022-11-23T02:58:20.0123267Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0123366Z fn() 2022-11-23T02:58:20.0123719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0123845Z test(self, **param_kwargs) 2022-11-23T02:58:20.0124281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0124408Z return func(*args, **kwargs) 2022-11-23T02:58:20.0124671Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0124785Z self.run_subtests( 2022-11-23T02:58:20.0125142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0125309Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0125661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0125817Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0126200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0126324Z output = model(*input) 2022-11-23T02:58:20.0126654Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0126795Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0127182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0127362Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0127717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0127837Z _lazy_init(state, module) 2022-11-23T02:58:20.0128195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0128340Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0128680Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0128809Z return func(*args, **kwargs) 2022-11-23T02:58:20.0129196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0129299Z p_assert( 2022-11-23T02:58:20.0129630Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0129758Z traceback.print_stack() 2022-11-23T02:58:20.0130001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0130241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0130372Z File "", line 1, in 2022-11-23T02:58:20.0130588Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0130734Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0130939Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0131077Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0131293Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0131399Z self.run() 2022-11-23T02:58:20.0131652Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0131808Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0132157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0132295Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0132641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0132766Z getattr(self, test_name)() 2022-11-23T02:58:20.0133130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0133277Z fn() 2022-11-23T02:58:20.0133649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0133774Z test(self, **param_kwargs) 2022-11-23T02:58:20.0134140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0134265Z return func(*args, **kwargs) 2022-11-23T02:58:20.0134507Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0134621Z self.run_subtests( 2022-11-23T02:58:20.0134982Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0135146Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0135515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0135673Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0136053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0136172Z output = model(*input) 2022-11-23T02:58:20.0136491Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0136637Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0137020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0137195Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0137566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0137688Z _lazy_init(state, module) 2022-11-23T02:58:20.0138042Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0138191Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0138515Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0138646Z return func(*args, **kwargs) 2022-11-23T02:58:20.0139034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0139137Z p_assert( 2022-11-23T02:58:20.0139478Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0139607Z traceback.print_stack() 2022-11-23T02:58:20.0139736Z File "", line 1, in 2022-11-23T02:58:20.0139950Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0140075Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0140280Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0140440Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0140657Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0140762Z self.run() 2022-11-23T02:58:20.0141011Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0141165Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0141493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0141626Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0141994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0142119Z getattr(self, test_name)() 2022-11-23T02:58:20.0142485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0142630Z fn() 2022-11-23T02:58:20.0143003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0143128Z test(self, **param_kwargs) 2022-11-23T02:58:20.0143472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0143601Z return func(*args, **kwargs) 2022-11-23T02:58:20.0143864Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0143979Z self.run_subtests( 2022-11-23T02:58:20.0144338Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0144501Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0144870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0145028Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0145392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0145516Z output = model(*input) 2022-11-23T02:58:20.0145848Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0145991Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0146375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0146555Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0146927Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0147048Z _lazy_init(state, module) 2022-11-23T02:58:20.0147407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0147539Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0147888Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0148018Z return func(*args, **kwargs) 2022-11-23T02:58:20.0148402Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0148506Z p_assert( 2022-11-23T02:58:20.0148847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0149142Z traceback.print_stack() 2022-11-23T02:58:20.0149376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0149618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0149750Z File "", line 1, in 2022-11-23T02:58:20.0149971Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0150116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0150322Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0150570Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0150796Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0150882Z self.run() 2022-11-23T02:58:20.0151091Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0151238Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0151590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0151726Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0152098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0152281Z getattr(self, test_name)() 2022-11-23T02:58:20.0152648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0152729Z fn() 2022-11-23T02:58:20.0153106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0153230Z test(self, **param_kwargs) 2022-11-23T02:58:20.0153593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0153720Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0153981Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0154094Z self.run_subtests( 2022-11-23T02:58:20.0154450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0154599Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0154972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0155131Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0155514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0155634Z output = model(*input) 2022-11-23T02:58:20.0155966Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0156108Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0156491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0156652Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0157028Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0157150Z _lazy_init(state, module) 2022-11-23T02:58:20.0157513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0157660Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0158008Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0158134Z return func(*args, **kwargs) 2022-11-23T02:58:20.0158518Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0158603Z p_assert( 2022-11-23T02:58:20.0158947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0159075Z traceback.print_stack() 2022-11-23T02:58:20.0159209Z File "", line 1, in 2022-11-23T02:58:20.0159422Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0159567Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0159819Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0159958Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0160176Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0160281Z self.run() 2022-11-23T02:58:20.0160488Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0160635Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0160983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0161117Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0161484Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0161638Z getattr(self, test_name)() 2022-11-23T02:58:20.0162006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0162107Z fn() 2022-11-23T02:58:20.0162480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0162601Z test(self, **param_kwargs) 2022-11-23T02:58:20.0162963Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0163089Z return func(*args, **kwargs) 2022-11-23T02:58:20.0163349Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0163447Z self.run_subtests( 2022-11-23T02:58:20.0163808Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0163975Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0164343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0164501Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0164886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0165005Z output = model(*input) 2022-11-23T02:58:20.0165337Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0165461Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0165841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0166019Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0166396Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0166517Z _lazy_init(state, module) 2022-11-23T02:58:20.0166878Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0167023Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0167365Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0167472Z return func(*args, **kwargs) 2022-11-23T02:58:20.0167857Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0167962Z p_assert( 2022-11-23T02:58:20.0168302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0168429Z traceback.print_stack() 2022-11-23T02:58:20.0168680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0168918Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0169050Z File "", line 1, in 2022-11-23T02:58:20.0169289Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0169438Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0169643Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0169796Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0170011Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0170118Z self.run() 2022-11-23T02:58:20.0170323Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0170452Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0170865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0170998Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0171369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0171494Z getattr(self, test_name)() 2022-11-23T02:58:20.0171861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0171961Z fn() 2022-11-23T02:58:20.0172333Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0172437Z test(self, **param_kwargs) 2022-11-23T02:58:20.0172804Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0172929Z return func(*args, **kwargs) 2022-11-23T02:58:20.0173192Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0173304Z self.run_subtests( 2022-11-23T02:58:20.0173660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0173829Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0174201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0174336Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0174722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0174840Z output = model(*input) 2022-11-23T02:58:20.0175172Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0175314Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0175702Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0175881Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0176258Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0176363Z _lazy_init(state, module) 2022-11-23T02:58:20.0176720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0176865Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0177207Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0177332Z return func(*args, **kwargs) 2022-11-23T02:58:20.0177717Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0177825Z p_assert( 2022-11-23T02:58:20.0178172Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0178282Z traceback.print_stack() 2022-11-23T02:58:20.0178459Z File "", line 1, in 2022-11-23T02:58:20.0178677Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0178821Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0179027Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0179181Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0179398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0179503Z self.run() 2022-11-23T02:58:20.0179690Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0179839Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0180235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0180369Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0180738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0180865Z getattr(self, test_name)() 2022-11-23T02:58:20.0181230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0181328Z fn() 2022-11-23T02:58:20.0181679Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0181803Z test(self, **param_kwargs) 2022-11-23T02:58:20.0182165Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0182293Z return func(*args, **kwargs) 2022-11-23T02:58:20.0182558Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0182674Z self.run_subtests( 2022-11-23T02:58:20.0183041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0183188Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0183557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0183710Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0184093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0184212Z output = model(*input) 2022-11-23T02:58:20.0184545Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0184693Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0185080Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0185257Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0185619Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0185742Z _lazy_init(state, module) 2022-11-23T02:58:20.0186099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0186244Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0186585Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0186712Z return func(*args, **kwargs) 2022-11-23T02:58:20.0187099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0187207Z p_assert( 2022-11-23T02:58:20.0187532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0187659Z traceback.print_stack() 2022-11-23T02:58:20.0187948Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0188196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0188331Z File "", line 1, in 2022-11-23T02:58:20.0188545Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0188691Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0188880Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0189209Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0189432Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0189616Z self.run() 2022-11-23T02:58:20.0189823Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0189972Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0190329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0190466Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0190817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0190943Z getattr(self, 
test_name)() 2022-11-23T02:58:20.0191307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0191406Z fn() 2022-11-23T02:58:20.0191776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0191903Z test(self, **param_kwargs) 2022-11-23T02:58:20.0192266Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0192393Z return func(*args, **kwargs) 2022-11-23T02:58:20.0192637Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0192751Z self.run_subtests( 2022-11-23T02:58:20.0193109Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0193273Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0193641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0193796Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0194177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0194301Z output = model(*input) 2022-11-23T02:58:20.0194614Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0194756Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0195145Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0195324Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0195695Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:58:20.0195817Z _lazy_init(state, module) 2022-11-23T02:58:20.0196177Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0196322Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0196647Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0196779Z return func(*args, **kwargs) 2022-11-23T02:58:20.0197163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0197323Z p_assert( 2022-11-23T02:58:20.0197678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0197807Z traceback.print_stack() 2022-11-23T02:58:20.0197938Z File "", line 1, in 2022-11-23T02:58:20.0198153Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0198278Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0198481Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0198634Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0198850Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0199001Z self.run() 2022-11-23T02:58:20.0199208Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0199357Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0199690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0199824Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0200191Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0200316Z getattr(self, test_name)() 
2022-11-23T02:58:20.0200678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0200776Z fn() 2022-11-23T02:58:20.0201142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0201270Z test(self, **param_kwargs) 2022-11-23T02:58:20.0201616Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0201743Z return func(*args, **kwargs) 2022-11-23T02:58:20.0202007Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0202122Z self.run_subtests( 2022-11-23T02:58:20.0202480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0202640Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0203009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0203163Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0203527Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0203653Z output = model(*input) 2022-11-23T02:58:20.0203984Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0204130Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0204518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0204696Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0205067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0205188Z _lazy_init(state, module) 2022-11-23T02:58:20.0205528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0205673Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0206022Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0206148Z return func(*args, **kwargs) 2022-11-23T02:58:20.0206618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0206729Z p_assert( 2022-11-23T02:58:20.0207076Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0207202Z traceback.print_stack() 2022-11-23T02:58:20.0207424Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0207667Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0207904Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0208135Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0208419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0208652Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0208889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0209119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0209334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0209567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0209799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0210027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0210259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0210492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0210722Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0210954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0211184Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0211395Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0211626Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0211853Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0212083Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0212312Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0212548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0212776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0213012Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0213224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0213453Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0213680Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0213909Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0214136Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0214368Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0214594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0214865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0215084Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0215314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0215542Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0215768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0215997Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0216224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0216497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0216727Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0216958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0217166Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0217393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0217618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0217847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0218074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0218302Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0218417Z dist init r=1, world=2 2022-11-23T02:58:20.0218755Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0219066Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0219402Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0219727Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0220045Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0220362Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0220678Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.0221009Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0221330Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0221640Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0221950Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0222304Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0222424Z dist init r=0, world=2 2022-11-23T02:58:20.0222722Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0223036Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0223348Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0223657Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0224018Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.0224328Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0224636Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0224946Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0225255Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0225568Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0225879Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0226187Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0226272Z ok (6.314s) 2022-11-23T02:58:20.0226630Z test_nested_always_wrap_model_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91095 2022-11-23T02:58:20.0226852Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91096 2022-11-23T02:58:20.0227246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0227425Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0227816Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0228010Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0228386Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0228546Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0229093Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0229305Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0229560Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.0229817Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.0230295Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.0230712Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.0230948Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.0231181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.0231403Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0231641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0232681Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0232869Z warnings.warn( 2022-11-23T02:58:20.0233905Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.0234018Z warnings.warn( 2022-11-23T02:58:20.0234154Z File "", line 1, in 2022-11-23T02:58:20.0234370Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0234515Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0234725Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0234878Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0235078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0235182Z self.run() 2022-11-23T02:58:20.0235387Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0235536Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0235888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0236023Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0236390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0236501Z getattr(self, test_name)() 2022-11-23T02:58:20.0236867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0236971Z fn() 2022-11-23T02:58:20.0237343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0237468Z test(self, **param_kwargs) 2022-11-23T02:58:20.0237831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0237961Z return func(*args, **kwargs) 2022-11-23T02:58:20.0238221Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0238316Z self.run_subtests( 2022-11-23T02:58:20.0238675Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0238841Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0239212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0239412Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0239806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0239927Z output = model(*input) 2022-11-23T02:58:20.0240260Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0240383Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0240763Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0240941Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0241363Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0241485Z _lazy_init(state, module) 2022-11-23T02:58:20.0241847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0241992Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0242335Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0242442Z return func(*args, **kwargs) 2022-11-23T02:58:20.0242830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0242934Z p_assert( 2022-11-23T02:58:20.0243274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0243405Z traceback.print_stack() 2022-11-23T02:58:20.0243537Z File "", line 1, in 2022-11-23T02:58:20.0243749Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0243890Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0244081Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0244235Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0244452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0244556Z self.run() 2022-11-23T02:58:20.0244761Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0244908Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0245255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0245386Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0245744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0245869Z getattr(self, test_name)() 2022-11-23T02:58:20.0246236Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0246336Z fn() 2022-11-23T02:58:20.0246705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0246829Z test(self, **param_kwargs) 2022-11-23T02:58:20.0247189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0247297Z return func(*args, **kwargs) 2022-11-23T02:58:20.0247558Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0247672Z self.run_subtests( 2022-11-23T02:58:20.0248033Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0248194Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0248617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0248779Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0249164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0249285Z output = model(*input) 2022-11-23T02:58:20.0249597Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0249740Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0250121Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0250344Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0250767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0250889Z _lazy_init(state, module) 2022-11-23T02:58:20.0251249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0251395Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0251722Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0251849Z return func(*args, **kwargs) 2022-11-23T02:58:20.0252233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0252337Z p_assert( 2022-11-23T02:58:20.0252675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0252808Z traceback.print_stack() 2022-11-23T02:58:20.0253051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0253274Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0253408Z File "", line 1, in 2022-11-23T02:58:20.0253619Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0253761Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0253967Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0254121Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0254337Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0254441Z self.run() 2022-11-23T02:58:20.0254628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0254775Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0255128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0255261Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0255633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0255761Z getattr(self, test_name)() 2022-11-23T02:58:20.0256130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0256228Z fn() 2022-11-23T02:58:20.0256580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0256705Z test(self, **param_kwargs) 2022-11-23T02:58:20.0257066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0257193Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0257459Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0257573Z self.run_subtests( 2022-11-23T02:58:20.0257981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0258154Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0258510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0258666Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0259047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0259168Z output = model(*input) 2022-11-23T02:58:20.0259501Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0259694Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0260078Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0260257Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0260616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0260739Z _lazy_init(state, module) 2022-11-23T02:58:20.0261097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0261243Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0261585Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0261712Z return func(*args, **kwargs) 2022-11-23T02:58:20.0262099Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0262206Z p_assert( 2022-11-23T02:58:20.0262532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0262659Z traceback.print_stack() 2022-11-23T02:58:20.0262792Z File "", line 1, in 2022-11-23T02:58:20.0263004Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0263147Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0263354Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0263507Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0263706Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0263812Z self.run() 2022-11-23T02:58:20.0264017Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0264168Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0264515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0264651Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0265023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0265147Z getattr(self, test_name)() 2022-11-23T02:58:20.0265491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0265590Z fn() 2022-11-23T02:58:20.0265961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0266085Z test(self, **param_kwargs) 2022-11-23T02:58:20.0266446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0266576Z return func(*args, **kwargs) 2022-11-23T02:58:20.0266837Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0266953Z self.run_subtests( 2022-11-23T02:58:20.0267339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0267508Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0267881Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0268036Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0268418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0268539Z output = model(*input) 2022-11-23T02:58:20.0268869Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0269242Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0269616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0269802Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0270178Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0270300Z _lazy_init(state, module) 2022-11-23T02:58:20.0270662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0270805Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0271147Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0271271Z return func(*args, **kwargs) 2022-11-23T02:58:20.0271636Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0271743Z p_assert( 2022-11-23T02:58:20.0272085Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0272210Z traceback.print_stack() 2022-11-23T02:58:20.0272456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0272700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0272831Z File "", line 1, in 2022-11-23T02:58:20.0273047Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0273172Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0273379Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0273533Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0273754Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0273858Z self.run() 2022-11-23T02:58:20.0274066Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0274211Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0274542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0274675Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0275041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0275166Z getattr(self, test_name)() 2022-11-23T02:58:20.0275532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0275629Z fn() 2022-11-23T02:58:20.0275998Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0276125Z test(self, **param_kwargs) 2022-11-23T02:58:20.0276466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0276593Z return func(*args, **kwargs) 2022-11-23T02:58:20.0276923Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0277046Z self.run_subtests( 2022-11-23T02:58:20.0277409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0277572Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0277943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0278097Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0278458Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0278638Z output = model(*input) 2022-11-23T02:58:20.0278972Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0279118Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0279501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0293296Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0293855Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0293994Z _lazy_init(state, module) 2022-11-23T02:58:20.0294371Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0294521Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0294887Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0295018Z return func(*args, **kwargs) 2022-11-23T02:58:20.0295418Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0295526Z p_assert( 2022-11-23T02:58:20.0295857Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0295989Z traceback.print_stack() 2022-11-23T02:58:20.0296126Z File "", line 1, in 2022-11-23T02:58:20.0296345Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0296494Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0296703Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0296859Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0297082Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0297170Z self.run() 2022-11-23T02:58:20.0297381Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0297536Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0297888Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0298028Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0298404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0298540Z getattr(self, test_name)() 2022-11-23T02:58:20.0298912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0298997Z fn() 2022-11-23T02:58:20.0299373Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0299505Z test(self, **param_kwargs) 2022-11-23T02:58:20.0299872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0300002Z return func(*args, **kwargs) 2022-11-23T02:58:20.0300367Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0300494Z self.run_subtests( 2022-11-23T02:58:20.0300847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0301016Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0301392Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0301552Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0301937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0302134Z output = model(*input) 2022-11-23T02:58:20.0302583Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0302735Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0303134Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0303302Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0303680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0303807Z _lazy_init(state, module) 2022-11-23T02:58:20.0304167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0304317Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0304669Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0304800Z return func(*args, **kwargs) 2022-11-23T02:58:20.0305193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0305282Z p_assert( 2022-11-23T02:58:20.0305629Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0305760Z traceback.print_stack() 2022-11-23T02:58:20.0306007Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0306250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0306384Z File "", line 1, in 2022-11-23T02:58:20.0306598Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0306731Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0306943Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0307098Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0307322Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0307429Z self.run() 2022-11-23T02:58:20.0307642Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0307792Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0308148Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0308268Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0308641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0308767Z getattr(self, 
test_name)() 2022-11-23T02:58:20.0309445Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0309555Z fn() 2022-11-23T02:58:20.0309941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0310164Z test(self, **param_kwargs) 2022-11-23T02:58:20.0310543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0310654Z return func(*args, **kwargs) 2022-11-23T02:58:20.0310921Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0311039Z self.run_subtests( 2022-11-23T02:58:20.0311400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0311566Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0312013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0312171Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0312561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0312666Z output = model(*input) 2022-11-23T02:58:20.0313004Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0313150Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0313536Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0313717Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0314092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:58:20.0314221Z _lazy_init(state, module) 2022-11-23T02:58:20.0314580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0314711Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0315061Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0315190Z return func(*args, **kwargs) 2022-11-23T02:58:20.0315578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0315683Z p_assert( 2022-11-23T02:58:20.0316026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0316160Z traceback.print_stack() 2022-11-23T02:58:20.0316297Z File "", line 1, in 2022-11-23T02:58:20.0316494Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0316648Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0316857Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0317013Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0317234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0317341Z self.run() 2022-11-23T02:58:20.0317552Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0317685Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0318039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0318175Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0318551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0318679Z getattr(self, test_name)() 
2022-11-23T02:58:20.0319051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0319153Z fn() 2022-11-23T02:58:20.0319670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0319784Z test(self, **param_kwargs) 2022-11-23T02:58:20.0320158Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0320286Z return func(*args, **kwargs) 2022-11-23T02:58:20.0320550Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0320668Z self.run_subtests( 2022-11-23T02:58:20.0321030Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0321197Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0321848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0322034Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0328630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0328751Z output = model(*input) 2022-11-23T02:58:20.0329086Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0329222Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0329598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0329773Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0330139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0330250Z _lazy_init(state, module) 2022-11-23T02:58:20.0330603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0330740Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0331082Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0331199Z return func(*args, **kwargs) 2022-11-23T02:58:20.0331576Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0331672Z p_assert( 2022-11-23T02:58:20.0332008Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0332119Z traceback.print_stack() 2022-11-23T02:58:20.0332354Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0332587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0332709Z File "", line 1, in 2022-11-23T02:58:20.0332916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0333064Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0333260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0333405Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0333604Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0333701Z self.run() 2022-11-23T02:58:20.0333898Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0334037Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0334381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0334510Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0334870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0334985Z getattr(self, test_name)() 2022-11-23T02:58:20.0335396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0335495Z fn() 2022-11-23T02:58:20.0335860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0335975Z test(self, **param_kwargs) 2022-11-23T02:58:20.0336327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0336446Z return func(*args, **kwargs) 2022-11-23T02:58:20.0336699Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0336847Z self.run_subtests( 2022-11-23T02:58:20.0358059Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0358220Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0358590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0358737Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0359108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0359221Z output = model(*input) 2022-11-23T02:58:20.0359543Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0359676Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0360040Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0360214Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0360579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0360697Z _lazy_init(state, module) 2022-11-23T02:58:20.0361050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0361186Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0361520Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0361637Z return func(*args, **kwargs) 2022-11-23T02:58:20.0362006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0362102Z p_assert( 2022-11-23T02:58:20.0362435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0362559Z traceback.print_stack() 2022-11-23T02:58:20.0362682Z File "", line 1, in 2022-11-23T02:58:20.0362886Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0363025Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0363213Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0363358Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0363564Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0363660Z self.run() 2022-11-23T02:58:20.0363856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0363995Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0364333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0364465Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0364816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0364932Z getattr(self, test_name)() 2022-11-23T02:58:20.0365398Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0365501Z fn() 2022-11-23T02:58:20.0365871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0365987Z test(self, **param_kwargs) 2022-11-23T02:58:20.0366341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0366458Z return func(*args, **kwargs) 2022-11-23T02:58:20.0366702Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0366874Z self.run_subtests( 2022-11-23T02:58:20.0367228Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0367384Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0367751Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0367898Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0368275Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0368387Z output = model(*input) 2022-11-23T02:58:20.0368702Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0368835Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0369209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0369387Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0369751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0369868Z _lazy_init(state, module) 2022-11-23T02:58:20.0370219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0370356Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0370680Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0370797Z return func(*args, **kwargs) 2022-11-23T02:58:20.0371169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0371263Z p_assert( 2022-11-23T02:58:20.0371597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0371720Z traceback.print_stack() 2022-11-23T02:58:20.0371955Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0372189Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0372302Z File "", line 1, in 2022-11-23T02:58:20.0372506Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0372641Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0372837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0372981Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0373189Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0373285Z self.run() 2022-11-23T02:58:20.0373472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0373613Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0373952Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0374078Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0374486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0374609Z getattr(self, test_name)() 2022-11-23T02:58:20.0374971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0375060Z fn() 2022-11-23T02:58:20.0375411Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0375526Z test(self, **param_kwargs) 2022-11-23T02:58:20.0375878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0376094Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0376347Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0376453Z self.run_subtests( 2022-11-23T02:58:20.0376807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0376972Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0377327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0377483Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0377865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0377987Z output = model(*input) 2022-11-23T02:58:20.0378317Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0378464Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0378847Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0379030Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0379385Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0379511Z _lazy_init(state, module) 2022-11-23T02:58:20.0379868Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0380015Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0380359Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0380485Z return func(*args, **kwargs) 2022-11-23T02:58:20.0380873Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0380977Z p_assert( 2022-11-23T02:58:20.0381304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0381433Z traceback.print_stack() 2022-11-23T02:58:20.0381563Z File "", line 1, in 2022-11-23T02:58:20.0381776Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0381920Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0382127Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0382280Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0382497Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0382584Z self.run() 2022-11-23T02:58:20.0382791Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0382940Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0383286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0383470Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0383848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0383972Z getattr(self, test_name)() 2022-11-23T02:58:20.0384316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0384416Z fn() 2022-11-23T02:58:20.0384785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0384911Z test(self, **param_kwargs) 2022-11-23T02:58:20.0385276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0385455Z return func(*args, **kwargs) 2022-11-23T02:58:20.0385720Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0385840Z self.run_subtests( 2022-11-23T02:58:20.0386189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0386354Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0386726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0386881Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0387262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0387382Z output = model(*input) 2022-11-23T02:58:20.0387719Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0387861Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0388227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0388407Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0388782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0388906Z _lazy_init(state, module) 2022-11-23T02:58:20.0389574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0389724Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0390068Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0390195Z return func(*args, **kwargs) 2022-11-23T02:58:20.0390589Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0390674Z p_assert( 2022-11-23T02:58:20.0391020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0391149Z traceback.print_stack() 2022-11-23T02:58:20.0391389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0391627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0391758Z File "", line 1, in 2022-11-23T02:58:20.0391973Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0392101Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0392306Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0392464Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0392682Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0392787Z self.run() 2022-11-23T02:58:20.0392993Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0393212Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0393575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0393691Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0394060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0394186Z getattr(self, test_name)() 2022-11-23T02:58:20.0394553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0394653Z fn() 2022-11-23T02:58:20.0395106Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0395231Z test(self, **param_kwargs) 2022-11-23T02:58:20.0395597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0395709Z return func(*args, **kwargs) 2022-11-23T02:58:20.0395971Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0396087Z self.run_subtests( 2022-11-23T02:58:20.0396447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0396613Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0396983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0397137Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0397526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0397628Z output = model(*input) 2022-11-23T02:58:20.0397963Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0398107Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0398491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0398668Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0399041Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0399163Z _lazy_init(state, module) 2022-11-23T02:58:20.0399522Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0399652Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0399998Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0400125Z return func(*args, **kwargs) 2022-11-23T02:58:20.0400516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0400622Z p_assert( 2022-11-23T02:58:20.0400966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0401095Z traceback.print_stack() 2022-11-23T02:58:20.0401224Z File "", line 1, in 2022-11-23T02:58:20.0401418Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0401561Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0401766Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0401923Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0402138Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0402244Z self.run() 2022-11-23T02:58:20.0402497Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0402635Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0402984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0403122Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0403488Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0403612Z getattr(self, test_name)() 2022-11-23T02:58:20.0403979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0404079Z fn() 2022-11-23T02:58:20.0404503Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0404610Z test(self, **param_kwargs) 2022-11-23T02:58:20.0404979Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0405110Z return func(*args, **kwargs) 2022-11-23T02:58:20.0405370Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0405486Z self.run_subtests( 2022-11-23T02:58:20.0405844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0406010Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0406379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0406518Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0406902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0407025Z output = model(*input) 2022-11-23T02:58:20.0407364Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0407507Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0407891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0408071Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0408443Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0408546Z _lazy_init(state, module) 2022-11-23T02:58:20.0408903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0409053Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0409397Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0409524Z return func(*args, **kwargs) 2022-11-23T02:58:20.0409911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0410017Z p_assert( 2022-11-23T02:58:20.0410360Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0410469Z traceback.print_stack() 2022-11-23T02:58:20.0410710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0410951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0411081Z File "", line 1, in 2022-11-23T02:58:20.0411299Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0411443Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0411647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0411847Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0412054Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0412160Z self.run() 2022-11-23T02:58:20.0412367Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0412517Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0412869Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0413004Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0413372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0413527Z getattr(self, 
test_name)() 2022-11-23T02:58:20.0413895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0413994Z fn() 2022-11-23T02:58:20.0414371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0414494Z test(self, **param_kwargs) 2022-11-23T02:58:20.0414860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0414987Z return func(*args, **kwargs) 2022-11-23T02:58:20.0415249Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0415345Z self.run_subtests( 2022-11-23T02:58:20.0415705Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0415871Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0416242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0416398Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0416786Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0416906Z output = model(*input) 2022-11-23T02:58:20.0417237Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0417363Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0417745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0417926Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0418298Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:58:20.0418425Z _lazy_init(state, module) 2022-11-23T02:58:20.0418783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0418932Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0419278Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0419386Z return func(*args, **kwargs) 2022-11-23T02:58:20.0419770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0419875Z p_assert( 2022-11-23T02:58:20.0420218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0420347Z traceback.print_stack() 2022-11-23T02:58:20.0420479Z File "", line 1, in 2022-11-23T02:58:20.0420697Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0420842Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0421029Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0421230Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0421452Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0421558Z self.run() 2022-11-23T02:58:20.0421763Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0421911Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0422262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0422397Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0422745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0422920Z getattr(self, test_name)() 
2022-11-23T02:58:20.0423287Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0423389Z fn() 2022-11-23T02:58:20.0423762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0423884Z test(self, **param_kwargs) 2022-11-23T02:58:20.0424247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0424356Z return func(*args, **kwargs) 2022-11-23T02:58:20.0424616Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0424730Z self.run_subtests( 2022-11-23T02:58:20.0425089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0425257Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0425628Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0425783Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0426167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0426289Z output = model(*input) 2022-11-23T02:58:20.0426604Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0426746Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0427128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0427307Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0427677Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0427803Z _lazy_init(state, module) 2022-11-23T02:58:20.0428163Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0428311Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0428636Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0428763Z return func(*args, **kwargs) 2022-11-23T02:58:20.0429333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0429445Z p_assert( 2022-11-23T02:58:20.0429796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0429925Z traceback.print_stack() 2022-11-23T02:58:20.0430168Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0430396Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0430527Z File "", line 1, in 2022-11-23T02:58:20.0430814Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0430966Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0431174Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0431328Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0431545Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0431650Z self.run() 2022-11-23T02:58:20.0431841Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0431989Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0432343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0432543Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0432915Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0433046Z getattr(self, test_name)() 2022-11-23T02:58:20.0433415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0433515Z fn() 2022-11-23T02:58:20.0433868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0433991Z test(self, **param_kwargs) 2022-11-23T02:58:20.0434351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0434478Z return func(*args, **kwargs) 2022-11-23T02:58:20.0434739Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0434858Z self.run_subtests( 2022-11-23T02:58:20.0435221Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0435388Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0435743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0435898Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0436278Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0436400Z output = model(*input) 2022-11-23T02:58:20.0436731Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0436875Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0437261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0437441Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0437799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0437921Z _lazy_init(state, module) 2022-11-23T02:58:20.0438277Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0438421Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0438762Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0438888Z return func(*args, **kwargs) 2022-11-23T02:58:20.0439271Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0439379Z p_assert( 2022-11-23T02:58:20.0439704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0439833Z traceback.print_stack() 2022-11-23T02:58:20.0439963Z File "", line 1, in 2022-11-23T02:58:20.0440226Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0440376Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0440584Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0440738Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0440936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0441040Z self.run() 2022-11-23T02:58:20.0441247Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0441395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0441747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0441933Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0442305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0442434Z getattr(self, test_name)() 2022-11-23T02:58:20.0442780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0442879Z fn() 2022-11-23T02:58:20.0443249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0443374Z test(self, **param_kwargs) 2022-11-23T02:58:20.0443738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0443866Z return func(*args, **kwargs) 2022-11-23T02:58:20.0444125Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0444244Z self.run_subtests( 2022-11-23T02:58:20.0444584Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0444752Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0445121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0445276Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0445657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0445779Z output = model(*input) 2022-11-23T02:58:20.0446108Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0446251Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0446622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0446802Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0447175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0447299Z _lazy_init(state, module) 2022-11-23T02:58:20.0447657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0447801Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0448144Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0448269Z return func(*args, **kwargs) 2022-11-23T02:58:20.0448635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0448742Z p_assert( 2022-11-23T02:58:20.0449084Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0449214Z traceback.print_stack() 2022-11-23T02:58:20.0449506Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0449753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0449927Z File "", line 1, in 2022-11-23T02:58:20.0450146Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0450271Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0450474Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0450626Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0450844Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0451003Z self.run() 2022-11-23T02:58:20.0451211Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0451359Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0451694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0451829Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0452201Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0452327Z getattr(self, test_name)() 2022-11-23T02:58:20.0452694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0452794Z fn() 2022-11-23T02:58:20.0453166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0453291Z test(self, **param_kwargs) 2022-11-23T02:58:20.0453643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0453768Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0454032Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0454147Z self.run_subtests( 2022-11-23T02:58:20.0454509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0454674Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0455045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0455202Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0455566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0455692Z output = model(*input) 2022-11-23T02:58:20.0456025Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0456168Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0456555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0456734Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0457106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0457229Z _lazy_init(state, module) 2022-11-23T02:58:20.0457586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0457713Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0458057Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0458188Z return func(*args, **kwargs) 2022-11-23T02:58:20.0458576Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0458681Z p_assert( 2022-11-23T02:58:20.0459069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0459203Z traceback.print_stack() 2022-11-23T02:58:20.0459315Z File "", line 1, in 2022-11-23T02:58:20.0459527Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0459674Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0459879Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0460032Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0460248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0460415Z self.run() 2022-11-23T02:58:20.0460623Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0460753Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0461105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0461243Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0461611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0461737Z getattr(self, test_name)() 2022-11-23T02:58:20.0462104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0462205Z fn() 2022-11-23T02:58:20.0462579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0462684Z test(self, **param_kwargs) 2022-11-23T02:58:20.0463051Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0463179Z return func(*args, **kwargs) 2022-11-23T02:58:20.0463443Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0463560Z self.run_subtests( 2022-11-23T02:58:20.0463921Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0464085Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0464454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0464589Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0464972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0465099Z output = model(*input) 2022-11-23T02:58:20.0465431Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0465574Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0465960Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0466140Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0466511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0466614Z _lazy_init(state, module) 2022-11-23T02:58:20.0466970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0467117Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0467459Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0467590Z return func(*args, **kwargs) 2022-11-23T02:58:20.0467977Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0468083Z p_assert( 2022-11-23T02:58:20.0468475Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0468590Z traceback.print_stack() 2022-11-23T02:58:20.0468834Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0469252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0469393Z File "", line 1, in 2022-11-23T02:58:20.0469606Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0469750Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0469957Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0470168Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0470386Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0470490Z self.run() 2022-11-23T02:58:20.0470700Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0470848Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0471206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0471342Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0471709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0471816Z getattr(self, test_name)() 2022-11-23T02:58:20.0472181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0472286Z fn() 2022-11-23T02:58:20.0472659Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0472782Z test(self, **param_kwargs) 2022-11-23T02:58:20.0473149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0473278Z return func(*args, **kwargs) 2022-11-23T02:58:20.0473537Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0473633Z self.run_subtests( 2022-11-23T02:58:20.0473991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0474156Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0474525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0474685Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0475068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0475189Z output = model(*input) 2022-11-23T02:58:20.0475523Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0475648Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0476033Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0476212Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0476586Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0476708Z _lazy_init(state, module) 2022-11-23T02:58:20.0477066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0477216Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0477560Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0477727Z return func(*args, **kwargs) 2022-11-23T02:58:20.0478128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0478234Z p_assert( 2022-11-23T02:58:20.0478577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0478707Z traceback.print_stack() 2022-11-23T02:58:20.0478837Z File "", line 1, in 2022-11-23T02:58:20.0479050Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0479194Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0479433Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0479586Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0479808Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0479914Z self.run() 2022-11-23T02:58:20.0480123Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0480273Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0480626Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0480744Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0481113Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0481238Z getattr(self, test_name)() 2022-11-23T02:58:20.0481600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0481705Z fn() 2022-11-23T02:58:20.0482078Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0482203Z test(self, **param_kwargs) 2022-11-23T02:58:20.0482570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0482680Z return func(*args, **kwargs) 2022-11-23T02:58:20.0482939Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0483053Z self.run_subtests( 2022-11-23T02:58:20.0483413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0483577Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0483943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0484101Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0484481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0484586Z output = model(*input) 2022-11-23T02:58:20.0484919Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0485061Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0485444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0485623Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0485995Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0486119Z _lazy_init(state, module) 2022-11-23T02:58:20.0486481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in 
_lazy_init 2022-11-23T02:58:20.0486608Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0486953Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0487131Z return func(*args, **kwargs) 2022-11-23T02:58:20.0487526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0487630Z p_assert( 2022-11-23T02:58:20.0487973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0488105Z traceback.print_stack() 2022-11-23T02:58:20.0488351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0488573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0488757Z File "", line 1, in 2022-11-23T02:58:20.0488973Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0489116Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0489327Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0489481Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0489698Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0489805Z self.run() 2022-11-23T02:58:20.0489992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0490140Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0490490Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0490625Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0490995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0491122Z getattr(self, 
test_name)() 2022-11-23T02:58:20.0491489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0491592Z fn() 2022-11-23T02:58:20.0491948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0492071Z test(self, **param_kwargs) 2022-11-23T02:58:20.0492433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0492560Z return func(*args, **kwargs) 2022-11-23T02:58:20.0492823Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0492939Z self.run_subtests( 2022-11-23T02:58:20.0493294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0493445Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0493815Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0493973Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0494358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0494479Z output = model(*input) 2022-11-23T02:58:20.0494810Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0494952Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0495335Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0495518Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0495879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 334, in _root_pre_forward 2022-11-23T02:58:20.0496002Z _lazy_init(state, module) 2022-11-23T02:58:20.0496407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0496560Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0496908Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0497037Z return func(*args, **kwargs) 2022-11-23T02:58:20.0497421Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0497525Z p_assert( 2022-11-23T02:58:20.0497849Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0497979Z traceback.print_stack() 2022-11-23T02:58:20.0498159Z File "", line 1, in 2022-11-23T02:58:20.0498373Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0498515Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0498724Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0498877Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0499076Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0499179Z self.run() 2022-11-23T02:58:20.0499386Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0499535Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0499883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0500017Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0500389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0500515Z getattr(self, test_name)() 
2022-11-23T02:58:20.0500862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0500965Z fn() 2022-11-23T02:58:20.0501332Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0501456Z test(self, **param_kwargs) 2022-11-23T02:58:20.0501819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0501947Z return func(*args, **kwargs) 2022-11-23T02:58:20.0502207Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0502320Z self.run_subtests( 2022-11-23T02:58:20.0502662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0502829Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0503203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0503359Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0503741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0503863Z output = model(*input) 2022-11-23T02:58:20.0504194Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0504338Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0504704Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0504882Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0505260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0505384Z _lazy_init(state, module) 2022-11-23T02:58:20.0505785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0505939Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0506286Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0506413Z return func(*args, **kwargs) 2022-11-23T02:58:20.0506777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0506881Z p_assert( 2022-11-23T02:58:20.0507223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0507398Z traceback.print_stack() 2022-11-23T02:58:20.0507643Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0507883Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0508124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0508360Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0508576Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0508808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0509327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0509573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0509809Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0510046Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0510278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0510513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0510728Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0510959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0511190Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0511418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0511646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0511878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0512109Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0512335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0512570Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0512781Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0513010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0513241Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0513466Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0513694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0513927Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0514157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0514456Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0514679Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0514908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0515139Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0515366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0515594Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0515824Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0516116Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0516343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0516559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0516786Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0517015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0517243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0517472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0517700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0517929Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0518161Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0518392Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0518606Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0518836Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0518951Z dist init r=0, world=2 2022-11-23T02:58:20.0519297Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0519626Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0519945Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0520261Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0520573Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0520880Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0521186Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.0521475Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0521784Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0522189Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0522504Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0522811Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0522927Z dist init r=1, world=2 2022-11-23T02:58:20.0523257Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0523651Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0523973Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0524287Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0524599Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.0524893Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0525205Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0525515Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0525825Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0526132Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0526436Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0526742Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0526848Z ok (6.815s) 2022-11-23T02:58:20.0527220Z test_nested_always_wrap_model_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91178 2022-11-23T02:58:20.0527445Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91179 2022-11-23T02:58:20.0527845Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0528007Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0528397Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0528592Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0528968Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0529152Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0529582Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0529783Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0530034Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.0530268Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.0530678Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.0531082Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.0531365Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.0531600Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.0531843Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0532082Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0533119Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0533237Z warnings.warn( 2022-11-23T02:58:20.0534275Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.0534388Z warnings.warn( 2022-11-23T02:58:20.0534518Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0534717Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0534861Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0535071Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0535223Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0535441Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0535549Z self.run() 2022-11-23T02:58:20.0535755Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0535886Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0536016Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0536370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0536504Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0536873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0536999Z getattr(self, test_name)() 2022-11-23T02:58:20.0537215Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0537360Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0537706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0537809Z fn() 2022-11-23T02:58:20.0538013Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0538165Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0538592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0538724Z test(self, **param_kwargs) 2022-11-23T02:58:20.0538941Z File
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0539029Z self.run() 2022-11-23T02:58:20.0539397Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0539524Z return func(*args, **kwargs) 2022-11-23T02:58:20.0539730Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0539882Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0540196Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0540313Z self.run_subtests( 2022-11-23T02:58:20.0540667Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0540782Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0541142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0541310Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0541676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0541805Z getattr(self, test_name)() 2022-11-23T02:58:20.0542173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0542327Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0542694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0542776Z fn() 2022-11-23T02:58:20.0543165Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0543285Z output = model(*input) 2022-11-23T02:58:20.0543658Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0543783Z test(self, **param_kwargs) 2022-11-23T02:58:20.0544111Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0544254Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0544620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0544733Z return func(*args, **kwargs) 2022-11-23T02:58:20.0545113Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0545292Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0545556Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0545672Z self.run_subtests( 2022-11-23T02:58:20.0546049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0546171Z _lazy_init(state, module) 2022-11-23T02:58:20.0546531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0546676Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0547035Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0547185Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0547555Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0547759Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0548111Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.0548239Z return func(*args, **kwargs) 2022-11-23T02:58:20.0548620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0548724Z output = model(*input) 2022-11-23T02:58:20.0549284Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0549398Z p_assert( 2022-11-23T02:58:20.0549739Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0549996Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0550349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0550478Z traceback.print_stack() 2022-11-23T02:58:20.0550864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0551025Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0551399Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0551520Z _lazy_init(state, module) 2022-11-23T02:58:20.0551879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0552024Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0552368Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0552499Z return func(*args, **kwargs) 2022-11-23T02:58:20.0552887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0552976Z p_assert( 2022-11-23T02:58:20.0553321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in 
p_assert 2022-11-23T02:58:20.0553448Z traceback.print_stack() 2022-11-23T02:58:20.0553690Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0553934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0554067Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0554279Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0554423Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0554614Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0554767Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0554985Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0555093Z self.run() 2022-11-23T02:58:20.0555300Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0555448Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0555797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0555931Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0556281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0556407Z getattr(self, test_name)() 2022-11-23T02:58:20.0556773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0556877Z fn() 2022-11-23T02:58:20.0557248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0557372Z test(self, **param_kwargs) 2022-11-23T02:58:20.0557801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0557918Z return
func(*args, **kwargs) 2022-11-23T02:58:20.0558181Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0558297Z self.run_subtests( 2022-11-23T02:58:20.0558662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0558828Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0559199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0559404Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0559788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0559915Z output = model(*input) 2022-11-23T02:58:20.0560231Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0560375Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0560757Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0560938Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0561308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0561431Z _lazy_init(state, module) 2022-11-23T02:58:20.0561791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0561936Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0562266Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0562393Z return func(*args, **kwargs) 2022-11-23T02:58:20.0562777Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0562881Z p_assert( 2022-11-23T02:58:20.0563223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0563351Z traceback.print_stack() 2022-11-23T02:58:20.0563482Z File "", line 1, in 2022-11-23T02:58:20.0563679Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0563820Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0564031Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0564184Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0564402Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0564507Z self.run() 2022-11-23T02:58:20.0564714Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0564861Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0565190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0565323Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0565690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0565815Z getattr(self, test_name)() 2022-11-23T02:58:20.0566181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0566283Z fn() 2022-11-23T02:58:20.0566654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0566823Z test(self, **param_kwargs) 2022-11-23T02:58:20.0567177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0567305Z return func(*args, **kwargs) 2022-11-23T02:58:20.0567567Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0567682Z self.run_subtests( 2022-11-23T02:58:20.0568042Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0568206Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0568575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0568782Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0569153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0569274Z output = model(*input) 2022-11-23T02:58:20.0569607Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0569752Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0570136Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0570315Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0570686Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0570811Z _lazy_init(state, module) 2022-11-23T02:58:20.0571153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0571296Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0571646Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0571773Z return func(*args, **kwargs) 2022-11-23T02:58:20.0572158Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0572263Z p_assert( 2022-11-23T02:58:20.0572606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0572733Z traceback.print_stack() 2022-11-23T02:58:20.0572957Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0573194Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0573331Z File "", line 1, in 2022-11-23T02:58:20.0573547Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0573689Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0573896Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0574050Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0574248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0574353Z self.run() 2022-11-23T02:58:20.0574559Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0574709Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0575058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0575193Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0575565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0575691Z getattr(self, test_name)() 2022-11-23T02:58:20.0576088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0576193Z fn() 2022-11-23T02:58:20.0576569Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0576699Z test(self, **param_kwargs) 2022-11-23T02:58:20.0577061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0577188Z return func(*args, **kwargs) 2022-11-23T02:58:20.0577448Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0577564Z self.run_subtests( 2022-11-23T02:58:20.0577957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0578120Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0578495Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0578649Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0579032Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0579152Z output = model(*input) 2022-11-23T02:58:20.0579281Z File "", line 1, in 2022-11-23T02:58:20.0579612Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0579736Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0580118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0580300Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0580514Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0580657Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0581032Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0581155Z _lazy_init(state, module) 2022-11-23T02:58:20.0581361Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0581496Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0581854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0581999Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0582214Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0582324Z self.run() 2022-11-23T02:58:20.0582671Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0582798Z return func(*args, **kwargs) 2022-11-23T02:58:20.0583008Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0583138Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0583521Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0583627Z p_assert( 2022-11-23T02:58:20.0583972Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0584106Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0584445Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0584573Z traceback.print_stack() 2022-11-23T02:58:20.0584927Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0585053Z getattr(self, test_name)() 2022-11-23T02:58:20.0585463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 
2022-11-23T02:58:20.0585568Z fn() 2022-11-23T02:58:20.0585941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0586066Z test(self, **param_kwargs) 2022-11-23T02:58:20.0586426Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0586553Z return func(*args, **kwargs) 2022-11-23T02:58:20.0586795Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0586909Z self.run_subtests( 2022-11-23T02:58:20.0587335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0587500Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0587875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0588029Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0588413Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0588534Z output = model(*input) 2022-11-23T02:58:20.0588848Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0589178Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0589579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0589765Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0590140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0590263Z _lazy_init(state, module) 2022-11-23T02:58:20.0590627Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0590773Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0591098Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0591225Z return func(*args, **kwargs) 2022-11-23T02:58:20.0591610Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0591715Z p_assert( 2022-11-23T02:58:20.0592058Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0592190Z traceback.print_stack() 2022-11-23T02:58:20.0592432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0592675Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0592791Z File "", line 1, in 2022-11-23T02:58:20.0593004Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0593149Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0593354Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0593507Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0593725Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0593832Z self.run() 2022-11-23T02:58:20.0594037Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0594170Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0594518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0594654Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0595088Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0595225Z getattr(self, test_name)() 2022-11-23T02:58:20.0595597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0595698Z fn() 2022-11-23T02:58:20.0596068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0596174Z test(self, **param_kwargs) 2022-11-23T02:58:20.0596535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0596725Z return func(*args, **kwargs) 2022-11-23T02:58:20.0596990Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0597107Z self.run_subtests( 2022-11-23T02:58:20.0597476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0597638Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0598009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0598146Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0598529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0598654Z output = model(*input) 2022-11-23T02:58:20.0598985Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0599133Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0599514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0599694Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0600070Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0600175Z _lazy_init(state, module) 2022-11-23T02:58:20.0600531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0600677Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0601019Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0601145Z return func(*args, **kwargs) 2022-11-23T02:58:20.0601530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0601639Z p_assert( 2022-11-23T02:58:20.0601984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0602096Z traceback.print_stack() 2022-11-23T02:58:20.0602227Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0602439Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0602582Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0602787Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0602941Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0603155Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0603241Z self.run() 2022-11-23T02:58:20.0603446Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0603597Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0603943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0604078Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0604495Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0604625Z getattr(self, test_name)() 2022-11-23T02:58:20.0604993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0605075Z fn() 2022-11-23T02:58:20.0605444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0605569Z test(self, **param_kwargs) 2022-11-23T02:58:20.0605931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0606107Z return func(*args, **kwargs) 2022-11-23T02:58:20.0606368Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0606484Z self.run_subtests( 2022-11-23T02:58:20.0606849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0606993Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0607365Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0607522Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0607902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0608027Z output = model(*input) 2022-11-23T02:58:20.0608360Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0608508Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0608890Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0609053Z args, kwargs = 
_root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0609429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0609553Z _lazy_init(state, module) 2022-11-23T02:58:20.0609911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0610058Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0610402Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0610531Z return func(*args, **kwargs) 2022-11-23T02:58:20.0610925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0611011Z p_assert( 2022-11-23T02:58:20.0611353Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0611485Z traceback.print_stack() 2022-11-23T02:58:20.0611729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0611968Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0612098Z File "", line 1, in 2022-11-23T02:58:20.0612310Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0612452Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0612640Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0612794Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0613014Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0613119Z self.run() 2022-11-23T02:58:20.0613325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0613520Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0613882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0613999Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0614366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0614494Z getattr(self, test_name)() 2022-11-23T02:58:20.0614858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0614958Z fn() 2022-11-23T02:58:20.0615328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0615501Z test(self, **param_kwargs) 2022-11-23T02:58:20.0615868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0615980Z return func(*args, **kwargs) 2022-11-23T02:58:20.0616240Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0616356Z self.run_subtests( 2022-11-23T02:58:20.0616716Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0616880Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0617009Z File "", line 1, in 2022-11-23T02:58:20.0617375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0617530Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0617903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0618024Z output = model(*input) 2022-11-23T02:58:20.0618240Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0618384Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0618717Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0618862Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0619066Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0619217Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0619581Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0619759Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0619978Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0620083Z self.run() 2022-11-23T02:58:20.0620458Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0620581Z _lazy_init(state, module) 2022-11-23T02:58:20.0620788Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0620934Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:58:20.0621273Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0621418Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0621758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0621892Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0622243Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0622370Z return func(*args, **kwargs) 2022-11-23T02:58:20.0622738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0622912Z getattr(self, test_name)() 2022-11-23T02:58:20.0623289Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0623394Z p_assert( 2022-11-23T02:58:20.0623758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0623858Z fn() 2022-11-23T02:58:20.0624201Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0624331Z traceback.print_stack() 2022-11-23T02:58:20.0624703Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0624859Z test(self, **param_kwargs) 2022-11-23T02:58:20.0625225Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0625355Z return func(*args, **kwargs) 2022-11-23T02:58:20.0625617Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0625731Z self.run_subtests( 2022-11-23T02:58:20.0626089Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0626254Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0626624Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0626760Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0627149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0627270Z output = model(*input) 2022-11-23T02:58:20.0627605Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0627748Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0628130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0628310Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0628685Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0628807Z _lazy_init(state, module) 2022-11-23T02:58:20.0629316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0629473Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0629827Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0629956Z return func(*args, **kwargs) 2022-11-23T02:58:20.0630343Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0630448Z p_assert( 2022-11-23T02:58:20.0630791Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0630901Z traceback.print_stack() 2022-11-23T02:58:20.0631143Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0631385Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0631516Z File "", line 1, in 2022-11-23T02:58:20.0631730Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0631879Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0632083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0632237Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0632506Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0632619Z self.run() 2022-11-23T02:58:20.0632826Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0632978Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0633329Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0633463Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0633830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0633955Z getattr(self, test_name)() 2022-11-23T02:58:20.0634377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0634478Z fn() 2022-11-23T02:58:20.0634853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0634977Z test(self, **param_kwargs) 2022-11-23T02:58:20.0635340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0635467Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0635729Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0635845Z self.run_subtests( 2022-11-23T02:58:20.0636186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0636351Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0636724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0636880Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0637265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0637386Z output = model(*input) 2022-11-23T02:58:20.0637718Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0637861Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0638229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0638408Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0638778Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0638905Z _lazy_init(state, module) 2022-11-23T02:58:20.0639262Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0639406Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0639753Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0639881Z return func(*args, **kwargs) 2022-11-23T02:58:20.0640249Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0640355Z p_assert( 2022-11-23T02:58:20.0640697Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0640825Z traceback.print_stack() 2022-11-23T02:58:20.0640957Z File "", line 1, in 2022-11-23T02:58:20.0641170Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0641318Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0641508Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0641662Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0641928Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0642040Z self.run() 2022-11-23T02:58:20.0642248Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0642395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0642743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0642878Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0643227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0643400Z getattr(self, test_name)() 2022-11-23T02:58:20.0643774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0643873Z fn() 2022-11-23T02:58:20.0644248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0644371Z test(self, **param_kwargs) 2022-11-23T02:58:20.0644731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0644858Z return func(*args, **kwargs) 2022-11-23T02:58:20.0645101Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0645215Z self.run_subtests( 2022-11-23T02:58:20.0645577Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0645740Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0646118Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0646270Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0646653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0646775Z output = model(*input) 2022-11-23T02:58:20.0647091Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0647233Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0647613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0647790Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0648161Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0648287Z _lazy_init(state, module) 2022-11-23T02:58:20.0648642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0648788Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0649117Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0649243Z return func(*args, **kwargs) 2022-11-23T02:58:20.0649627Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0649731Z p_assert( 2022-11-23T02:58:20.0650116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0650248Z traceback.print_stack() 2022-11-23T02:58:20.0650490Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0650735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0650847Z File "", line 1, in 2022-11-23T02:58:20.0651061Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0651254Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0651465Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0651618Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0651836Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0651944Z self.run() 2022-11-23T02:58:20.0652135Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0652284Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0652644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0652842Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0653215Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0653340Z getattr(self, test_name)() 2022-11-23T02:58:20.0653709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0653811Z fn() 2022-11-23T02:58:20.0654166Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0654290Z test(self, **param_kwargs) 2022-11-23T02:58:20.0654650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0654777Z return func(*args, **kwargs) 2022-11-23T02:58:20.0654910Z File "", line 1, in 2022-11-23T02:58:20.0655170Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0655289Z self.run_subtests( 2022-11-23T02:58:20.0655648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0655797Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0656013Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0656156Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0656522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0656677Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0656883Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0657036Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0657461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0657575Z output = model(*input) 2022-11-23T02:58:20.0657794Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0657900Z self.run() 2022-11-23T02:58:20.0658244Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0658386Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0658593Z 
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0658741Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0659127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0659287Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0659631Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0659768Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0660140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0660263Z _lazy_init(state, module) 2022-11-23T02:58:20.0660679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0660808Z getattr(self, test_name)() 2022-11-23T02:58:20.0661169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0661295Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0661658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0661758Z fn() 2022-11-23T02:58:20.0662097Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0662324Z return func(*args, **kwargs) 2022-11-23T02:58:20.0662696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0662819Z test(self, **param_kwargs) 2022-11-23T02:58:20.0663208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0663294Z p_assert( 2022-11-23T02:58:20.0663662Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0663789Z return func(*args, **kwargs) 2022-11-23T02:58:20.0664130Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0664257Z traceback.print_stack() 2022-11-23T02:58:20.0664520Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0664640Z self.run_subtests( 2022-11-23T02:58:20.0664983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0665147Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0665519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0665675Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0666054Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0666172Z output = model(*input) 2022-11-23T02:58:20.0666503Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0666644Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0667026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0667192Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0667565Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0667686Z _lazy_init(state, module) 2022-11-23T02:58:20.0668043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 
2022-11-23T02:58:20.0668188Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0668530Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0668654Z return func(*args, **kwargs) 2022-11-23T02:58:20.0669210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0669303Z p_assert( 2022-11-23T02:58:20.0669658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0669789Z traceback.print_stack() 2022-11-23T02:58:20.0670030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0670341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0670480Z File "", line 1, in 2022-11-23T02:58:20.0670694Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0670820Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0671026Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0671180Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0671394Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0671498Z self.run() 2022-11-23T02:58:20.0671768Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0671917Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0672274Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0672393Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0672762Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0672888Z getattr(self, test_name)() 
2022-11-23T02:58:20.0673257Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0673358Z fn() 2022-11-23T02:58:20.0673725Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0673848Z test(self, **param_kwargs) 2022-11-23T02:58:20.0674211Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0674324Z return func(*args, **kwargs) 2022-11-23T02:58:20.0674585Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0674704Z self.run_subtests( 2022-11-23T02:58:20.0675065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0675228Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0675595Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0675749Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0676130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0676234Z output = model(*input) 2022-11-23T02:58:20.0676576Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0676721Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0677106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0677286Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0677659Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0677780Z _lazy_init(state, module) 2022-11-23T02:58:20.0678137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0678265Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0678608Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0678740Z return func(*args, **kwargs) 2022-11-23T02:58:20.0679126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0679231Z p_assert( 2022-11-23T02:58:20.0679624Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0679757Z traceback.print_stack() 2022-11-23T02:58:20.0679890Z File "", line 1, in 2022-11-23T02:58:20.0680087Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0680229Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0680436Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0680589Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0680805Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0680909Z self.run() 2022-11-23T02:58:20.0681177Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0681307Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0681651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0681787Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0682152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0682275Z getattr(self, test_name)() 
2022-11-23T02:58:20.0682636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0682737Z fn() 2022-11-23T02:58:20.0683105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0683210Z test(self, **param_kwargs) 2022-11-23T02:58:20.0683571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0683701Z return func(*args, **kwargs) 2022-11-23T02:58:20.0683960Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0684076Z self.run_subtests( 2022-11-23T02:58:20.0684437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0684600Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0684964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0685102Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0685481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0685602Z output = model(*input) 2022-11-23T02:58:20.0685935Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0686078Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0686463Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0686641Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0687015Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0687120Z _lazy_init(state, module) 2022-11-23T02:58:20.0687473Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0687618Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0687961Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0688092Z return func(*args, **kwargs) 2022-11-23T02:58:20.0688481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0688589Z p_assert( 2022-11-23T02:58:20.0688977Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0689094Z traceback.print_stack() 2022-11-23T02:58:20.0689335Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0689574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0689702Z File "", line 1, in 2022-11-23T02:58:20.0689913Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0690056Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0690261Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0690458Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0690655Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0690758Z self.run() 2022-11-23T02:58:20.0690963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0691109Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0691459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0691592Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0691956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0692079Z getattr(self, test_name)() 2022-11-23T02:58:20.0692424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0692527Z fn() 2022-11-23T02:58:20.0692894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0693018Z test(self, **param_kwargs) 2022-11-23T02:58:20.0693377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0693500Z return func(*args, **kwargs) 2022-11-23T02:58:20.0693756Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0693853Z self.run_subtests( 2022-11-23T02:58:20.0693981Z File "", line 1, in 2022-11-23T02:58:20.0694340Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0694503Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0694874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0695033Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0695244Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0695388Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0695753Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0695873Z output = model(*input) 2022-11-23T02:58:20.0696077Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0696228Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0696557Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0696699Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0696912Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0697019Z self.run() 2022-11-23T02:58:20.0697387Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0697565Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0697821Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0697973Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0698345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0698469Z _lazy_init(state, module) 2022-11-23T02:58:20.0698808Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0698943Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0699282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0699475Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0699842Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0699966Z getattr(self, test_name)() 2022-11-23T02:58:20.0700311Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0700436Z return func(*args, **kwargs) 2022-11-23T02:58:20.0700800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0700897Z fn() 2022-11-23T02:58:20.0701264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0701367Z p_assert( 2022-11-23T02:58:20.0701734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0701865Z test(self, **param_kwargs) 2022-11-23T02:58:20.0702209Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0702336Z traceback.print_stack() 2022-11-23T02:58:20.0702701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0702826Z return func(*args, **kwargs) 2022-11-23T02:58:20.0703069Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0703182Z self.run_subtests( 2022-11-23T02:58:20.0703540Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0703700Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0704068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0704224Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0704603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0704724Z output = model(*input) 2022-11-23T02:58:20.0705040Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0705183Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0705563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0705743Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0706115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0706237Z _lazy_init(state, module) 2022-11-23T02:58:20.0706592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0706739Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0707066Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0707250Z return func(*args, **kwargs) 2022-11-23T02:58:20.0707646Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0707749Z p_assert( 2022-11-23T02:58:20.0708089Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0708217Z traceback.print_stack() 2022-11-23T02:58:20.0708459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0708698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0708810Z File "", line 1, in 2022-11-23T02:58:20.0709268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0709420Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0709627Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0709784Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0710003Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0710106Z self.run() 2022-11-23T02:58:20.0710295Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0710442Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0710793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0710927Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0711294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0711422Z getattr(self, test_name)() 2022-11-23T02:58:20.0711789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0711888Z fn() 2022-11-23T02:58:20.0712242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0712367Z test(self, **param_kwargs) 2022-11-23T02:58:20.0712727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0712850Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0713108Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0713224Z self.run_subtests( 2022-11-23T02:58:20.0713579Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0713748Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0714101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0714260Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0714643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0714763Z output = model(*input) 2022-11-23T02:58:20.0715092Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0715236Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0715619Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0715797Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0716160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0716281Z _lazy_init(state, module) 2022-11-23T02:58:20.0716637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0716853Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0717205Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0717330Z return func(*args, **kwargs) 2022-11-23T02:58:20.0717714Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0717818Z p_assert( 2022-11-23T02:58:20.0718143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0718271Z traceback.print_stack() 2022-11-23T02:58:20.0718401Z File "", line 1, in 2022-11-23T02:58:20.0718688Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0718832Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0719038Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0719196Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0719414Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0719501Z self.run() 2022-11-23T02:58:20.0719706Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0719853Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0720202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0720338Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0720706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0720832Z getattr(self, test_name)() 2022-11-23T02:58:20.0721179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0721277Z fn() 2022-11-23T02:58:20.0721649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0721773Z test(self, **param_kwargs) 2022-11-23T02:58:20.0722133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0722257Z return func(*args, **kwargs) 2022-11-23T02:58:20.0722521Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0722636Z self.run_subtests( 2022-11-23T02:58:20.0722977Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0723145Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0723518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0723677Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0724055Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0724175Z output = model(*input) 2022-11-23T02:58:20.0724506Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0724647Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0725011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0725189Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0725567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0725686Z _lazy_init(state, module) 2022-11-23T02:58:20.0726087Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0726237Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0726580Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0726706Z return func(*args, **kwargs) 2022-11-23T02:58:20.0727074Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0727175Z p_assert( 2022-11-23T02:58:20.0727513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0727640Z traceback.print_stack() 2022-11-23T02:58:20.0727933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0728171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0728304Z File "", line 1, in 2022-11-23T02:58:20.0728522Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0728647Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0728848Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0729000Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0729216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0729319Z self.run() 2022-11-23T02:58:20.0729447Z File "", line 1, in 2022-11-23T02:58:20.0729650Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0729798Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0730135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0730267Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0730473Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0730619Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0730991Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0731116Z getattr(self, test_name)() 2022-11-23T02:58:20.0731320Z File 
"/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0731471Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0731836Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0731935Z fn() 2022-11-23T02:58:20.0732139Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0732243Z self.run() 2022-11-23T02:58:20.0732617Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0732745Z test(self, **param_kwargs) 2022-11-23T02:58:20.0732955Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0733103Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0733469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0733594Z return func(*args, **kwargs) 2022-11-23T02:58:20.0733917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0734049Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0734310Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0734428Z self.run_subtests( 2022-11-23T02:58:20.0734793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0734917Z getattr(self, test_name)() 2022-11-23T02:58:20.0735319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0735472Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0735838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 
534, in wrapper 2022-11-23T02:58:20.0735938Z fn() 2022-11-23T02:58:20.0736305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0736455Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0736818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0736989Z test(self, **param_kwargs) 2022-11-23T02:58:20.0737370Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0737475Z output = model(*input) 2022-11-23T02:58:20.0737838Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0737962Z return func(*args, **kwargs) 2022-11-23T02:58:20.0738290Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0738433Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0738690Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0738805Z self.run_subtests( 2022-11-23T02:58:20.0739184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0739349Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0739709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0739872Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0740244Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0740363Z _lazy_init(state, module) 2022-11-23T02:58:20.0740729Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0740881Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0741234Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0741383Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0741746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0741866Z output = model(*input) 2022-11-23T02:58:20.0742211Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0742335Z return func(*args, **kwargs) 2022-11-23T02:58:20.0742667Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0742808Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0743188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0743290Z p_assert( 2022-11-23T02:58:20.0743655Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0743835Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0744176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0744302Z traceback.print_stack() 2022-11-23T02:58:20.0744721Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0744848Z _lazy_init(state, module) 2022-11-23T02:58:20.0745208Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0745351Z 
handle.init_flat_param_attributes() 2022-11-23T02:58:20.0745673Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0745803Z return func(*args, **kwargs) 2022-11-23T02:58:20.0746188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0746338Z p_assert( 2022-11-23T02:58:20.0746680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0746806Z traceback.print_stack() 2022-11-23T02:58:20.0747052Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0747271Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0747406Z File "", line 1, in 2022-11-23T02:58:20.0747618Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0747760Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0747963Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0748115Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0748328Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0748435Z self.run() 2022-11-23T02:58:20.0748622Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0748767Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0749289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0749431Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0749806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0749933Z getattr(self, test_name)() 2022-11-23T02:58:20.0750331Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0750430Z fn() 2022-11-23T02:58:20.0750784Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0750908Z test(self, **param_kwargs) 2022-11-23T02:58:20.0751277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0751403Z return func(*args, **kwargs) 2022-11-23T02:58:20.0751669Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0751786Z self.run_subtests( 2022-11-23T02:58:20.0752147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0752311Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0752663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0752819Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0753199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0753319Z output = model(*input) 2022-11-23T02:58:20.0753650Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0753791Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0754246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0754435Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0754794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:20.0754917Z _lazy_init(state, module) 2022-11-23T02:58:20.0755274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0755417Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0755760Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0755946Z return func(*args, **kwargs) 2022-11-23T02:58:20.0756334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0756437Z p_assert( 2022-11-23T02:58:20.0756768Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0756897Z traceback.print_stack() 2022-11-23T02:58:20.0757027Z File "", line 1, in 2022-11-23T02:58:20.0757239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0757380Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0757581Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0757733Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0757930Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0758037Z self.run() 2022-11-23T02:58:20.0758241Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0758388Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0758742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0758876Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0759243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0759366Z getattr(self, test_name)() 2022-11-23T02:58:20.0759713Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0759813Z fn() 2022-11-23T02:58:20.0760181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0760310Z test(self, **param_kwargs) 2022-11-23T02:58:20.0760672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0760797Z return func(*args, **kwargs) 2022-11-23T02:58:20.0761061Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 150, in test_nested_always_wrap_model 2022-11-23T02:58:20.0761175Z self.run_subtests( 2022-11-23T02:58:20.0761518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0761683Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0762053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0762206Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0762584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0762709Z output = model(*input) 2022-11-23T02:58:20.0763040Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0763180Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0763590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0763774Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0764153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 
2022-11-23T02:58:20.0764280Z _lazy_init(state, module) 2022-11-23T02:58:20.0764635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 93, in _lazy_init 2022-11-23T02:58:20.0764780Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0765122Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0765298Z return func(*args, **kwargs) 2022-11-23T02:58:20.0765668Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0765777Z p_assert( 2022-11-23T02:58:20.0766119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0766245Z traceback.print_stack() 2022-11-23T02:58:20.0766488Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0766722Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0766960Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0767195Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0767417Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0767647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0767878Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0768111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0768338Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0768568Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0768798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0769026Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0769253Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0769469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0769698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0769931Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0770160Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0770388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0770613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0770844Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0771071Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0771283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0771513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0771742Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0772018Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0772252Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0772480Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0772708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0772938Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0773149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0773376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0773649Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0773881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0774111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0774341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0774570Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0774797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0775025Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0775234Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0775464Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0775691Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0775923Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0776149Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0776377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0776602Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0776829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0777039Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0777267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0777381Z dist init r=1, world=2 2022-11-23T02:58:20.0777720Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0778050Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0778368Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0778683Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0778997Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0779333Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0779701Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.0780006Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0780316Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0780624Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0780931Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0781299Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.0781414Z dist init r=0, world=2 2022-11-23T02:58:20.0781730Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0782043Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0782353Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0782658Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0782972Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.0783282Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0783572Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0783879Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0784186Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0784498Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0784808Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0785115Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.0785218Z ok (6.815s) 2022-11-23T02:58:20.0785560Z test_nested_wrapped_model_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91261 2022-11-23T02:58:20.0785783Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91262 2022-11-23T02:58:20.0786174Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0786341Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0786778Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0786977Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0787355Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0787531Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0787914Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0788108Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0788358Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.0788654Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.0789209Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.0789634Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.0789867Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.0790101Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.0790338Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0790572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0791609Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0791731Z warnings.warn( 2022-11-23T02:58:20.0792762Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0792873Z warnings.warn( 2022-11-23T02:58:20.0793105Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0793328Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0793565Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0793805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0794035Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0794266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0794497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0794729Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0794961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0795178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0795409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0795708Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0795943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0796174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0796402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0796631Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0796858Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0797086Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0797357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0797589Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0797819Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0798049Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0798278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0798508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0798735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0798961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0799170Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0799404Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0799632Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0799861Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0800090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0800316Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0800543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0800768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0800978Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0801205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0801436Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0801662Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0801895Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0802121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0802349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0802576Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0803352Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.0804204Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.0804446Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0804660Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0804893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0805127Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0805241Z dist init r=1, world=2 2022-11-23T02:58:20.0805352Z dist init r=0, world=2 2022-11-23T02:58:20.0805499Z ok (5.714s) 2022-11-23T02:58:20.0805839Z test_nested_wrapped_model_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91344 2022-11-23T02:58:20.0806043Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91345 2022-11-23T02:58:20.0806431Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0806613Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0807000Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0807195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0807568Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0807745Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0808137Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0808327Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0808561Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.0808810Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.0809217Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.0809619Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.0809851Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.0810082Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.0810323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0810561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0811599Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0811713Z warnings.warn( 2022-11-23T02:58:20.0812739Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0812883Z warnings.warn( 2022-11-23T02:58:20.0813128Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0813363Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0813599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0813833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0814064Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0814292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0814571Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0814798Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0815014Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0815246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0815474Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0815701Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0815932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0816162Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0816388Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0816620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0816831Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0817062Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0817287Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0817517Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0817741Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0817969Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0818196Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0818423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0818637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0818866Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0819092Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0819319Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0819548Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0819775Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0820002Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0820229Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0820461Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0820673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0820945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0821180Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0821409Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0821638Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0821868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0822095Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0822324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0822582Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0822810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0823039Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0823267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0823496Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0823610Z dist init r=1, world=2 2022-11-23T02:58:20.0823720Z dist init r=0, world=2 2022-11-23T02:58:20.0823819Z ok (5.913s) 2022-11-23T02:58:20.0824151Z test_nested_wrapped_model_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91427 2022-11-23T02:58:20.0824371Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91428 2022-11-23T02:58:20.0824761Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0824939Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0825327Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0825522Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0825895Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0826072Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0826439Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0826633Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0826886Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.0827133Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.0827544Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.0827946Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.0828181Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.0828411Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.0828651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0828870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0830159Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0830289Z warnings.warn( 2022-11-23T02:58:20.0831327Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0831497Z warnings.warn( 2022-11-23T02:58:20.0831732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0831964Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0832203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0832439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0832673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0832902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0833117Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0833349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0833578Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0833810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0834042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0834275Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0834507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0834735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0834951Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0835181Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0835405Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0835642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0835874Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0836108Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0836334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0836561Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0836788Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0837001Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0837230Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0837454Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0837687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0837915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0838186Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0838426Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0838651Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0838862Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0839090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0839318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0839546Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0839833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0840060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0840292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0840522Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0840751Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0840959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0841233Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0841459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0841689Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0841922Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0842150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0842266Z dist init r=1, world=2 2022-11-23T02:58:20.0842378Z dist init r=0, world=2 2022-11-23T02:58:20.0842462Z ok (5.913s) 2022-11-23T02:58:20.0842804Z test_nested_wrapped_model_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91510 2022-11-23T02:58:20.0843024Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91511 2022-11-23T02:58:20.0843414Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0843593Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0843986Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0844182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0844558Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.0844718Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.0845102Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.0845296Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.0845544Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.0845791Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.0846192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.0846599Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.0846881Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.0847119Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.0847339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0847575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0848609Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.0848774Z warnings.warn( 2022-11-23T02:58:20.0849801Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.0849919Z warnings.warn( 2022-11-23T02:58:20.0850090Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0850308Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0850453Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0850666Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0850801Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0851018Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0851125Z self.run() 2022-11-23T02:58:20.0851331Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0851477Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0851831Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0851969Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0852340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0852446Z getattr(self, test_name)() 2022-11-23T02:58:20.0852813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0852914Z fn() 2022-11-23T02:58:20.0853290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0853414Z test(self, **param_kwargs) 2022-11-23T02:58:20.0853778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0853904Z return func(*args, **kwargs) 2022-11-23T02:58:20.0854160Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0854256Z self.run_subtests( 2022-11-23T02:58:20.0854615Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0854781Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0855152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0855308Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0855691Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0855862Z output = model(*input) 2022-11-23T02:58:20.0856204Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0856330Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0856713Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0856890Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0857264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0857386Z _lazy_init(state, module) 2022-11-23T02:58:20.0857795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0857945Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0858291Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0858401Z return func(*args, **kwargs) 2022-11-23T02:58:20.0858789Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0858893Z p_assert( 2022-11-23T02:58:20.0859240Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0859371Z traceback.print_stack() 2022-11-23T02:58:20.0859501Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0859717Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0859860Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0860054Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0860205Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0860420Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0860527Z self.run() 2022-11-23T02:58:20.0860734Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0860882Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0861230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0861347Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0861716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0861842Z getattr(self, test_name)() 2022-11-23T02:58:20.0862207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0862307Z fn() 2022-11-23T02:58:20.0862679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0862804Z test(self, **param_kwargs) 2022-11-23T02:58:20.0863173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0863281Z return func(*args, **kwargs) 2022-11-23T02:58:20.0863539Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0863654Z self.run_subtests( 2022-11-23T02:58:20.0864009Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0864175Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0864547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0864707Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0865090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0865239Z output = model(*input) 2022-11-23T02:58:20.0865579Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0865721Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0866107Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0866285Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0866657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0866779Z _lazy_init(state, module) 2022-11-23T02:58:20.0867189Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0867317Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0867662Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0867787Z return func(*args, **kwargs) 2022-11-23T02:58:20.0868175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0868277Z p_assert( 2022-11-23T02:58:20.0868617Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0868745Z traceback.print_stack() 2022-11-23T02:58:20.0869154Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0869386Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0869527Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0869740Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0869885Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0870093Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0870244Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0870461Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0870565Z self.run() 2022-11-23T02:58:20.0870754Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0870901Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0871253Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0871389Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0871764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0871889Z getattr(self, test_name)() 2022-11-23T02:58:20.0872256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0872340Z fn() 2022-11-23T02:58:20.0872711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0872841Z test(self, **param_kwargs) 2022-11-23T02:58:20.0873205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0873332Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0873590Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0873704Z self.run_subtests( 2022-11-23T02:58:20.0874059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0874210Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0874656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0874823Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0875206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0875328Z output = model(*input) 2022-11-23T02:58:20.0875662Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0875810Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0876197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0876360Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0876816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0876940Z _lazy_init(state, module) 2022-11-23T02:58:20.0877306Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0877453Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0877794Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0877921Z return func(*args, **kwargs) 2022-11-23T02:58:20.0878302Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0878387Z p_assert( 2022-11-23T02:58:20.0878726Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0878857Z traceback.print_stack() 2022-11-23T02:58:20.0878987Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0879198Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0879342Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0879548Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0879699Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0879899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0879999Z self.run() 2022-11-23T02:58:20.0880204Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0880349Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0880694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0880826Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0881202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0881326Z getattr(self, test_name)() 2022-11-23T02:58:20.0881676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0881776Z fn() 2022-11-23T02:58:20.0882145Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0882268Z test(self, **param_kwargs) 2022-11-23T02:58:20.0882629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0882756Z return func(*args, **kwargs) 2022-11-23T02:58:20.0883017Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0883113Z self.run_subtests( 2022-11-23T02:58:20.0883474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0883638Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0884061Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0884223Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0884609Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0884730Z output = model(*input) 2022-11-23T02:58:20.0885060Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0885204Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0885571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0885800Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0886174Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0886296Z _lazy_init(state, module) 2022-11-23T02:58:20.0886657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0886803Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0887146Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0887269Z return func(*args, **kwargs) 2022-11-23T02:58:20.0887633Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0887733Z p_assert( 2022-11-23T02:58:20.0888069Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0888200Z traceback.print_stack() 2022-11-23T02:58:20.0888439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0888676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0888811Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0889007Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0889150Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0889355Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0889508Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0889723Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0889827Z self.run() 2022-11-23T02:58:20.0890030Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0890180Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0890515Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0890647Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0891021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0891144Z getattr(self, test_name)() 2022-11-23T02:58:20.0891512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0891606Z fn() 2022-11-23T02:58:20.0891976Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0892098Z test(self, **param_kwargs) 2022-11-23T02:58:20.0892441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0892571Z return func(*args, **kwargs) 2022-11-23T02:58:20.0892825Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0892939Z self.run_subtests( 2022-11-23T02:58:20.0893345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0893514Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0893886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0894041Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0894404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0894528Z output = model(*input) 2022-11-23T02:58:20.0894857Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0895045Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0895434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0895612Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0895990Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0896112Z _lazy_init(state, module) 2022-11-23T02:58:20.0896453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.0896599Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0896942Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0897067Z return func(*args, **kwargs) 2022-11-23T02:58:20.0897452Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0897559Z p_assert( 2022-11-23T02:58:20.0897904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0898032Z traceback.print_stack() 2022-11-23T02:58:20.0898148Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0898363Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0898506Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0898710Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0898862Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0899078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0899181Z self.run() 2022-11-23T02:58:20.0899368Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0899519Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0899866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0900002Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0900373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0900497Z getattr(self, test_name)() 2022-11-23T02:58:20.0900866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0900963Z fn() 2022-11-23T02:58:20.0901316Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0901439Z test(self, **param_kwargs) 2022-11-23T02:58:20.0901798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0901927Z return func(*args, **kwargs) 2022-11-23T02:58:20.0902184Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0902298Z self.run_subtests( 2022-11-23T02:58:20.0902702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0902871Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0903222Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0903378Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0903752Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0903874Z output = model(*input) 2022-11-23T02:58:20.0904203Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0904408Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0904792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0904976Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0905333Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0905455Z _lazy_init(state, module) 2022-11-23T02:58:20.0905811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.0905955Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0906296Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0906422Z return func(*args, **kwargs) 2022-11-23T02:58:20.0906804Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0906913Z p_assert( 2022-11-23T02:58:20.0907236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0907364Z traceback.print_stack() 2022-11-23T02:58:20.0907610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0907850Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0907980Z File "", line 1, in 2022-11-23T02:58:20.0908192Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0908335Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0908537Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0908672Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0908890Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0909167Z self.run() 2022-11-23T02:58:20.0909385Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0909534Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0909894Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0910030Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0910382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0910509Z getattr(self, test_name)() 
2022-11-23T02:58:20.0910873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0910972Z fn() 2022-11-23T02:58:20.0911342Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0911470Z test(self, **param_kwargs) 2022-11-23T02:58:20.0911834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0911960Z return func(*args, **kwargs) 2022-11-23T02:58:20.0912267Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0912391Z self.run_subtests( 2022-11-23T02:58:20.0912755Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0912920Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0913291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0913445Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0913826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0914007Z output = model(*input) 2022-11-23T02:58:20.0914325Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0914471Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0914854Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0915033Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0915405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0915527Z _lazy_init(state, module) 2022-11-23T02:58:20.0915881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0916026Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0916373Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0916481Z return func(*args, **kwargs) 2022-11-23T02:58:20.0916869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0916973Z p_assert( 2022-11-23T02:58:20.0917314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0917439Z traceback.print_stack() 2022-11-23T02:58:20.0917568Z File "", line 1, in 2022-11-23T02:58:20.0917779Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0917908Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0918113Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0918266Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0918485Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0918590Z self.run() 2022-11-23T02:58:20.0918796Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0918947Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0919290Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0919405Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0919770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0919891Z getattr(self, test_name)() 
2022-11-23T02:58:20.0920256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0920354Z fn() 2022-11-23T02:58:20.0920721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0920846Z test(self, **param_kwargs) 2022-11-23T02:58:20.0921205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0921314Z return func(*args, **kwargs) 2022-11-23T02:58:20.0921617Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0921736Z self.run_subtests( 2022-11-23T02:58:20.0922098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0922261Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0922632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0922786Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0923170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0923327Z output = model(*input) 2022-11-23T02:58:20.0923661Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0923808Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0924194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0924373Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0924745Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0924867Z _lazy_init(state, module) 2022-11-23T02:58:20.0925225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0925350Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0925699Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0925825Z return func(*args, **kwargs) 2022-11-23T02:58:20.0926213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0926318Z p_assert( 2022-11-23T02:58:20.0926661Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0926788Z traceback.print_stack() 2022-11-23T02:58:20.0927030Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0927254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.0927386Z File "", line 1, in 2022-11-23T02:58:20.0927598Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0927747Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0927952Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0928105Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0928322Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0928410Z self.run() 2022-11-23T02:58:20.0928615Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0928762Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0929109Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0929241Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0929607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0929729Z getattr(self, test_name)() 2022-11-23T02:58:20.0930100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0930181Z fn() 2022-11-23T02:58:20.0930549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0930721Z test(self, **param_kwargs) 2022-11-23T02:58:20.0931096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0931222Z return func(*args, **kwargs) 2022-11-23T02:58:20.0931480Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0931595Z self.run_subtests( 2022-11-23T02:58:20.0931954Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0932099Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0932525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0932679Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0933062Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0933183Z output = model(*input) 2022-11-23T02:58:20.0933517Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0933657Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0934043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0934205Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0934577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0934705Z _lazy_init(state, module) 2022-11-23T02:58:20.0935065Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0935211Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0935558Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0935683Z return func(*args, **kwargs) 2022-11-23T02:58:20.0936067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0936152Z p_assert( 2022-11-23T02:58:20.0936493Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0936620Z traceback.print_stack() 2022-11-23T02:58:20.0936747Z File "", line 1, in 2022-11-23T02:58:20.0936960Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0937106Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0937311Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0937464Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0937665Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0937771Z self.run() 2022-11-23T02:58:20.0937974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0938121Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0938466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0938600Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0938965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0939072Z getattr(self, test_name)() 2022-11-23T02:58:20.0939441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0939540Z fn() 2022-11-23T02:58:20.0939912Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0940082Z test(self, **param_kwargs) 2022-11-23T02:58:20.0940454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0940581Z return func(*args, **kwargs) 2022-11-23T02:58:20.0940838Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0940936Z self.run_subtests( 2022-11-23T02:58:20.0941290Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0941454Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0941928Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0942081Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0942468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0942589Z output = model(*input) 2022-11-23T02:58:20.0942916Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0943040Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0943425Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0943604Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0943973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0944098Z _lazy_init(state, module) 2022-11-23T02:58:20.0944454Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0944597Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0944946Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0945055Z return func(*args, **kwargs) 2022-11-23T02:58:20.0945437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0945541Z p_assert( 2022-11-23T02:58:20.0945884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.0946011Z traceback.print_stack() 2022-11-23T02:58:20.0946254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0946493Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0946629Z File "", line 1, in 2022-11-23T02:58:20.0946827Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0946971Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0947177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0947330Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0947545Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0947649Z self.run() 2022-11-23T02:58:20.0947855Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0947999Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0948326Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0948465Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0948832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0949123Z getattr(self, test_name)() 2022-11-23T02:58:20.0949580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0949689Z fn() 2022-11-23T02:58:20.0950102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0950211Z test(self, **param_kwargs) 2022-11-23T02:58:20.0950574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0950700Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.0950959Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0951075Z self.run_subtests( 2022-11-23T02:58:20.0951505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0951669Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0952045Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0952184Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0952567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0952689Z output = model(*input) 2022-11-23T02:58:20.0953020Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0953163Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0953546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0953725Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0954096Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0954216Z _lazy_init(state, module) 2022-11-23T02:58:20.0954559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0954705Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0955047Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0955172Z return func(*args, **kwargs) 2022-11-23T02:58:20.0955554Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0955662Z p_assert( 2022-11-23T02:58:20.0956002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0956117Z traceback.print_stack() 2022-11-23T02:58:20.0956247Z File "", line 1, in 2022-11-23T02:58:20.0956459Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0956604Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0956809Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0956962Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0957179Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0957284Z self.run() 2022-11-23T02:58:20.0957514Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0957667Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0958019Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0958159Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0958528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0958650Z getattr(self, test_name)() 2022-11-23T02:58:20.0959064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0959169Z fn() 2022-11-23T02:58:20.0959528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0959651Z test(self, **param_kwargs) 2022-11-23T02:58:20.0960013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.0960141Z return func(*args, **kwargs) 2022-11-23T02:58:20.0960398Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0960573Z self.run_subtests( 2022-11-23T02:58:20.0960934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0961100Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0961453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0961609Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0961990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0962111Z output = model(*input) 2022-11-23T02:58:20.0962444Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0962589Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0962973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0963156Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0963514Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0963642Z _lazy_init(state, module) 2022-11-23T02:58:20.0964002Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0964147Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0964490Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0964619Z return func(*args, **kwargs) 2022-11-23T02:58:20.0965005Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0965110Z p_assert( 2022-11-23T02:58:20.0965434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0965567Z traceback.print_stack() 2022-11-23T02:58:20.0965810Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0966053Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0966185Z File "", line 1, in 2022-11-23T02:58:20.0966399Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0966543Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0966732Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0966885Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0967101Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0967206Z self.run() 2022-11-23T02:58:20.0967411Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0967564Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0967910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0968046Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0968442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0968575Z getattr(self, test_name)() 2022-11-23T02:58:20.0968943Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0969044Z fn() 2022-11-23T02:58:20.0969418Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0969543Z test(self, **param_kwargs) 2022-11-23T02:58:20.0969905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0970095Z return func(*args, **kwargs) 2022-11-23T02:58:20.0970333Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0970448Z self.run_subtests( 2022-11-23T02:58:20.0970812Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0970975Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0971348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0971502Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0971882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0972007Z output = model(*input) 2022-11-23T02:58:20.0972320Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0972465Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0972848Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0973030Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0973404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0973525Z _lazy_init(state, module) 2022-11-23T02:58:20.0973884Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.0974028Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0974352Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0974479Z return func(*args, **kwargs) 2022-11-23T02:58:20.0974869Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0974970Z p_assert( 2022-11-23T02:58:20.0975310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0975442Z traceback.print_stack() 2022-11-23T02:58:20.0975574Z File "", line 1, in 2022-11-23T02:58:20.0975788Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0975914Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0976121Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0976272Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0976486Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0976590Z self.run() 2022-11-23T02:58:20.0976796Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0976947Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0977277Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0977408Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0977823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0977956Z getattr(self, test_name)() 2022-11-23T02:58:20.0978324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0978422Z fn() 2022-11-23T02:58:20.0978794Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0978918Z test(self, **param_kwargs) 2022-11-23T02:58:20.0979262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0979441Z return func(*args, **kwargs) 2022-11-23T02:58:20.0979697Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0979813Z self.run_subtests( 2022-11-23T02:58:20.0980177Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0980341Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0980711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0980866Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0981227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0981349Z output = model(*input) 2022-11-23T02:58:20.0981680Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0981826Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0982211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0982392Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0982767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.0982887Z _lazy_init(state, module) 2022-11-23T02:58:20.0983227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.0983368Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0983713Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0983839Z return func(*args, **kwargs) 2022-11-23T02:58:20.0984228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0984331Z p_assert( 2022-11-23T02:58:20.0984670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0984800Z traceback.print_stack() 2022-11-23T02:58:20.0985028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0985267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.0985397Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0985609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0985750Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0985956Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0986107Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0986327Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0986412Z self.run() 2022-11-23T02:58:20.0986618Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0986811Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0987167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0987301Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0987664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0987789Z getattr(self, test_name)()
2022-11-23T02:58:20.0988154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0988234Z fn() 2022-11-23T02:58:20.0988602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0988773Z test(self, **param_kwargs) 2022-11-23T02:58:20.0989367Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0989506Z return func(*args, **kwargs) 2022-11-23T02:58:20.0989767Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0989882Z self.run_subtests( 2022-11-23T02:58:20.0990226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0990391Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.0990766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.0990920Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.0991308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.0991429Z output = model(*input) 2022-11-23T02:58:20.0991761Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.0991907Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.0992295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.0992457Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.0992827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.0992950Z _lazy_init(state, module) 2022-11-23T02:58:20.0993304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.0993454Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.0993797Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.0993921Z return func(*args, **kwargs) 2022-11-23T02:58:20.0994309Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.0994395Z p_assert( 2022-11-23T02:58:20.0994739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.0994868Z traceback.print_stack() 2022-11-23T02:58:20.0994999Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.0995209Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.0995353Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.0995558Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.0995698Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.0995914Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.0996018Z self.run() 2022-11-23T02:58:20.0996225Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.0996442Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.0996802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.0996938Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.0997305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.0997412Z getattr(self, test_name)()
2022-11-23T02:58:20.0997779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.0997876Z fn() 2022-11-23T02:58:20.0998244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.0998431Z test(self, **param_kwargs) 2022-11-23T02:58:20.0998797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.0998926Z return func(*args, **kwargs) 2022-11-23T02:58:20.0999184Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.0999281Z self.run_subtests( 2022-11-23T02:58:20.0999641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.0999805Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1000175Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1000330Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1000717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1000838Z output = model(*input) 2022-11-23T02:58:20.1001171Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1001295Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1001676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1001854Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1002227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1002350Z _lazy_init(state, module) 2022-11-23T02:58:20.1002706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1002855Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1003196Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1003305Z return func(*args, **kwargs) 2022-11-23T02:58:20.1003694Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1003796Z p_assert( 2022-11-23T02:58:20.1004137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1004264Z traceback.print_stack() 2022-11-23T02:58:20.1004505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1004743Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1004872Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.1005065Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1005213Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1005416Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1005567Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1005829Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1005941Z self.run() 2022-11-23T02:58:20.1006147Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1006278Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1006633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1006767Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1007136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1007261Z getattr(self, test_name)() 2022-11-23T02:58:20.1007681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1007781Z fn() 2022-11-23T02:58:20.1008156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1008263Z test(self, **param_kwargs) 2022-11-23T02:58:20.1008632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1008759Z return func(*args, **kwargs) 2022-11-23T02:58:20.1009014Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1009128Z self.run_subtests( 2022-11-23T02:58:20.1009484Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1009648Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1010025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1010162Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1010546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1010669Z output = model(*input) 2022-11-23T02:58:20.1011000Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1011142Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1011523Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1011703Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1012075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1012182Z _lazy_init(state, module) 2022-11-23T02:58:20.1012539Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1012685Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1013031Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1013158Z return func(*args, **kwargs) 2022-11-23T02:58:20.1013546Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1013650Z p_assert( 2022-11-23T02:58:20.1013993Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1014103Z traceback.print_stack() 2022-11-23T02:58:20.1014234Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.1014445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1014591Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1014801Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1014953Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1015217Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1015330Z self.run() 2022-11-23T02:58:20.1015521Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1015671Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1016022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1016158Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1016524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1016699Z getattr(self, test_name)() 2022-11-23T02:58:20.1017070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1017150Z fn() 2022-11-23T02:58:20.1017526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1017650Z test(self, **param_kwargs) 2022-11-23T02:58:20.1018016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1018138Z return func(*args, **kwargs) 2022-11-23T02:58:20.1018396Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1018512Z self.run_subtests( 2022-11-23T02:58:20.1018869Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1019014Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1019387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1019541Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1019929Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1020047Z output = model(*input) 2022-11-23T02:58:20.1020375Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1020521Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1020903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1021064Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1021431Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1021558Z _lazy_init(state, module) 2022-11-23T02:58:20.1021913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1022060Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1022404Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1022528Z return func(*args, **kwargs) 2022-11-23T02:58:20.1022915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1023016Z p_assert( 2022-11-23T02:58:20.1023341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1023469Z traceback.print_stack() 2022-11-23T02:58:20.1023710Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1023954Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1024085Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.1024301Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1024494Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1024687Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1024841Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1025055Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1025157Z self.run() 2022-11-23T02:58:20.1025366Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1025516Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1025871Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1026059Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1026415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1026539Z getattr(self, test_name)() 2022-11-23T02:58:20.1026908Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1027008Z fn() 2022-11-23T02:58:20.1027380Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1027504Z test(self, **param_kwargs) 2022-11-23T02:58:20.1027864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1027991Z return func(*args,
**kwargs) 2022-11-23T02:58:20.1028229Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1028349Z self.run_subtests( 2022-11-23T02:58:20.1028711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1028877Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1029436Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1029595Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1029980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1030102Z output = model(*input) 2022-11-23T02:58:20.1030414Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1030558Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1030941Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1031125Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1031500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1031625Z _lazy_init(state, module) 2022-11-23T02:58:20.1031984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1032128Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1032456Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1032582Z return func(*args, **kwargs) 2022-11-23T02:58:20.1032969Z File
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1033074Z p_assert( 2022-11-23T02:58:20.1033416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1033550Z traceback.print_stack() 2022-11-23T02:58:20.1033681Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.1033946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1034102Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1034305Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1034459Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1034675Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1034780Z self.run() 2022-11-23T02:58:20.1034987Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1035137Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1035473Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1035681Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1036052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1036180Z getattr(self, test_name)() 2022-11-23T02:58:20.1036551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1036651Z fn() 2022-11-23T02:58:20.1037021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1037145Z test(self, **param_kwargs) 2022-11-23T02:58:20.1037486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in
wrapper 2022-11-23T02:58:20.1037613Z return func(*args, **kwargs) 2022-11-23T02:58:20.1037871Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1037991Z self.run_subtests( 2022-11-23T02:58:20.1038346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1038510Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1038885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1039038Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1039404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1039526Z output = model(*input) 2022-11-23T02:58:20.1039855Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1039997Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1040377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1040560Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1040936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1041060Z _lazy_init(state, module) 2022-11-23T02:58:20.1041401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1041547Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1041891Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1042017Z return func(*args, **kwargs) 2022-11-23T02:58:20.1042401Z File
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1042505Z p_assert( 2022-11-23T02:58:20.1042851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1042980Z traceback.print_stack() 2022-11-23T02:58:20.1043205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1043494Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1043630Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.1043844Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1043988Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1044194Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1044346Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1044563Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1044649Z self.run() 2022-11-23T02:58:20.1044903Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1045050Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1045403Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1045542Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1045913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1046037Z getattr(self, test_name)() 2022-11-23T02:58:20.1046385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1046486Z fn() 2022-11-23T02:58:20.1046859Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1046982Z test(self, **param_kwargs) 2022-11-23T02:58:20.1047340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1047470Z return func(*args, **kwargs) 2022-11-23T02:58:20.1047730Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1047850Z self.run_subtests( 2022-11-23T02:58:20.1048192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1048356Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1048727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1048880Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1049259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1049379Z output = model(*input) 2022-11-23T02:58:20.1049713Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1049859Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1050272Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1050450Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1050826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1050947Z _lazy_init(state, module) 2022-11-23T02:58:20.1051305Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1051451Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1051793Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1051921Z return func(*args, **kwargs) 2022-11-23T02:58:20.1052288Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1052392Z p_assert( 2022-11-23T02:58:20.1052788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1052925Z traceback.print_stack() 2022-11-23T02:58:20.1053057Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.1053270Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1053416Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1053621Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1053757Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1053974Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1054077Z self.run() 2022-11-23T02:58:20.1054335Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1054483Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1054833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1054970Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1055338Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1055444Z getattr(self, test_name)() 2022-11-23T02:58:20.1055807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1055905Z fn() 2022-11-23T02:58:20.1056273Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1056397Z test(self, **param_kwargs) 2022-11-23T02:58:20.1056758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1056889Z return func(*args, **kwargs) 2022-11-23T02:58:20.1057128Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1057244Z self.run_subtests( 2022-11-23T02:58:20.1057601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1057761Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1058130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1058282Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1058661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1058781Z output = model(*input) 2022-11-23T02:58:20.1059097Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1059238Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1059627Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1059807Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1060177Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1060301Z _lazy_init(state, module) 2022-11-23T02:58:20.1060657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1060802Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1061147Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1061260Z return func(*args, **kwargs) 2022-11-23T02:58:20.1061647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1061748Z p_assert( 2022-11-23T02:58:20.1062149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1062283Z traceback.print_stack() 2022-11-23T02:58:20.1062527Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1062763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1062876Z File "", line 1, in 2022-11-23T02:58:20.1063090Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1063233Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1063439Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1063638Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1063856Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1063961Z self.run() 2022-11-23T02:58:20.1064170Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1064299Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1064653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1064788Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1065156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1065279Z getattr(self, test_name)() 
2022-11-23T02:58:20.1065644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1065742Z fn() 2022-11-23T02:58:20.1066119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1066225Z test(self, **param_kwargs) 2022-11-23T02:58:20.1066592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1066718Z return func(*args, **kwargs) 2022-11-23T02:58:20.1066973Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1067092Z self.run_subtests( 2022-11-23T02:58:20.1067449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1067614Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1067983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1068120Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1068509Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1068628Z output = model(*input) 2022-11-23T02:58:20.1069129Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1069288Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1069680Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1069861Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1070235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1070339Z _lazy_init(state, module) 2022-11-23T02:58:20.1070698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1070847Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1071192Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1071319Z return func(*args, **kwargs) 2022-11-23T02:58:20.1071775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1071888Z p_assert( 2022-11-23T02:58:20.1072238Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1072347Z traceback.print_stack() 2022-11-23T02:58:20.1072477Z File "", line 1, in 2022-11-23T02:58:20.1072690Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1072836Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1073040Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1073261Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1073483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1073569Z self.run() 2022-11-23T02:58:20.1073778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1073926Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1074279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1074413Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1074781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1074905Z getattr(self, test_name)() 
2022-11-23T02:58:20.1075270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1075355Z fn() 2022-11-23T02:58:20.1075727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1075851Z test(self, **param_kwargs) 2022-11-23T02:58:20.1076214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1076342Z return func(*args, **kwargs) 2022-11-23T02:58:20.1076598Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1076712Z self.run_subtests( 2022-11-23T02:58:20.1077066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1077211Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1077583Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1077743Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1078120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1078242Z output = model(*input) 2022-11-23T02:58:20.1078578Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1078722Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1079103Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1079262Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1079632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1079754Z _lazy_init(state, module) 2022-11-23T02:58:20.1080109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1080258Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1080600Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1080725Z return func(*args, **kwargs) 2022-11-23T02:58:20.1081157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1081248Z p_assert( 2022-11-23T02:58:20.1081593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1081721Z traceback.print_stack() 2022-11-23T02:58:20.1081962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1082201Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1082440Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1082726Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1082959Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1083178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1083411Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1083641Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1083872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1084102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1084334Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1084564Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1084797Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1085028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1085784Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1086550Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1087307Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1088069Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:20.1088820Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1089573Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1090416Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1091177Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1091418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1091704Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1091941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1092179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1092415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1092647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1092881Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1093096Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1093326Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1093558Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1093785Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1094019Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1094251Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1094482Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1094707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1094937Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1095150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1095377Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1095605Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1095832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1096061Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1096292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1096520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1096750Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1097494Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1107257Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1108105Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1108860Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1110074Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:20.1110815Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1111553Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1112297Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1112539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1112771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1112999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1113235Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1113470Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1113687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1113925Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1114157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1114391Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1114621Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1114732Z dist init r=1, world=2 2022-11-23T02:58:20.1115069Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1115394Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1115708Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1116007Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1116423Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1116749Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1117054Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1117354Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1117711Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1118012Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1118309Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1118605Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1118705Z dist init r=0, world=2 2022-11-23T02:58:20.1119009Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1119306Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1119611Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1119913Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1120218Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1120522Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1120825Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. 
You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1121135Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1121440Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1121743Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1122050Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1122358Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1122458Z ok (6.014s) 2022-11-23T02:58:20.1122780Z test_nested_wrapped_model_offload_true_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91593 2022-11-23T02:58:20.1123047Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91594 2022-11-23T02:58:20.1123444Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1123619Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1124004Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1124195Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1124570Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1124795Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1125168Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1125363Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1125610Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.1125858Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.1126264Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1126662Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.1126895Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.1127132Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.1127367Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1127591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1128621Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1128733Z warnings.warn( 2022-11-23T02:58:20.1129750Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.1129862Z warnings.warn( 2022-11-23T02:58:20.1129989Z File "", line 1, in 2022-11-23T02:58:20.1130200Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1130340Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1130546Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1130694Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1130907Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1130999Z self.run() 2022-11-23T02:58:20.1131202Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1131344Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1131742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1131881Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1132248Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1132371Z getattr(self, test_name)() 2022-11-23T02:58:20.1132721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1132818Z fn() 2022-11-23T02:58:20.1133185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1133306Z test(self, **param_kwargs) 2022-11-23T02:58:20.1133719Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1133838Z return func(*args, **kwargs) 2022-11-23T02:58:20.1134096Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1134207Z self.run_subtests( 2022-11-23T02:58:20.1134549Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1134709Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1135080Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1135232Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1135612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1135732Z output = model(*input) 2022-11-23T02:58:20.1136061Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1136200Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1136571Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1136748Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1137120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1137241Z _lazy_init(state, module) 2022-11-23T02:58:20.1137595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1137737Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1138078Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1138206Z return func(*args, **kwargs) 2022-11-23T02:58:20.1138578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1138678Z p_assert( 2022-11-23T02:58:20.1139019Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1139143Z traceback.print_stack() 2022-11-23T02:58:20.1139268Z File "", line 1, in 2022-11-23T02:58:20.1139474Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1139614Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1139817Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1139951Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1140164Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1140270Z self.run() 2022-11-23T02:58:20.1140472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1140616Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1141010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1141146Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1141501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1141621Z getattr(self, test_name)() 2022-11-23T02:58:20.1141985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1142082Z fn() 2022-11-23T02:58:20.1142449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1142568Z test(self, **param_kwargs) 2022-11-23T02:58:20.1142981Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1143104Z return func(*args, **kwargs) 2022-11-23T02:58:20.1143346Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1143461Z self.run_subtests( 2022-11-23T02:58:20.1143817Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1143976Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1144339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1144491Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1144866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1144988Z output = model(*input) 2022-11-23T02:58:20.1145303Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1145439Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1145820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1145997Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1146367Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1146483Z _lazy_init(state, module) 2022-11-23T02:58:20.1146836Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1146978Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1147318Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1147432Z return func(*args, **kwargs) 2022-11-23T02:58:20.1147813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1147910Z p_assert( 2022-11-23T02:58:20.1148252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1148376Z traceback.print_stack() 2022-11-23T02:58:20.1148610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1148847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1149190Z File "", line 1, in 2022-11-23T02:58:20.1149418Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1149557Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1149759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1149915Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1150166Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1150272Z self.run() 2022-11-23T02:58:20.1150548Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1150686Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1151044Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1151173Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1151543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1151664Z getattr(self, test_name)() 2022-11-23T02:58:20.1152027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1152186Z fn() 2022-11-23T02:58:20.1152561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1152667Z test(self, **param_kwargs) 2022-11-23T02:58:20.1153031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1153155Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.1153411Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1153523Z self.run_subtests( 2022-11-23T02:58:20.1153878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1154038Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1154405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1154544Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1154924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1155039Z output = model(*input) 2022-11-23T02:58:20.1155369Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1155509Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1155886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1156061Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1156433Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1156537Z _lazy_init(state, module) 2022-11-23T02:58:20.1156894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1157040Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1157382Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1157505Z return func(*args, **kwargs) 2022-11-23T02:58:20.1157889Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1157991Z p_assert( 2022-11-23T02:58:20.1158328Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1158439Z traceback.print_stack() 2022-11-23T02:58:20.1158567Z File "", line 1, in 2022-11-23T02:58:20.1158777Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1158912Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1159113Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1159264Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1159478Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1159564Z self.run() 2022-11-23T02:58:20.1159817Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1159969Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1160316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1160449Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1160813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1160935Z getattr(self, test_name)() 2022-11-23T02:58:20.1161300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1161431Z fn() 2022-11-23T02:58:20.1161802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1161919Z test(self, **param_kwargs) 2022-11-23T02:58:20.1162282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1162404Z return func(*args, **kwargs) 2022-11-23T02:58:20.1162659Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1162773Z self.run_subtests( 2022-11-23T02:58:20.1163128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1163274Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1163642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1163801Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1164181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1164296Z output = model(*input) 2022-11-23T02:58:20.1164627Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1164765Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1165148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1165311Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1165679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1165795Z _lazy_init(state, module) 2022-11-23T02:58:20.1166148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1166295Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1166639Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1166766Z return func(*args, **kwargs) 2022-11-23T02:58:20.1167152Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1167239Z p_assert( 2022-11-23T02:58:20.1167578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1167702Z traceback.print_stack() 2022-11-23T02:58:20.1167940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1168178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1168306Z File "", line 1, in 2022-11-23T02:58:20.1168518Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1168659Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1168847Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1169045Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1169262Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1169363Z self.run() 2022-11-23T02:58:20.1169569Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1169715Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1170064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1170181Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1170547Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1170728Z getattr(self, test_name)() 2022-11-23T02:58:20.1171095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1171191Z fn() 2022-11-23T02:58:20.1171563Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1171683Z test(self, **param_kwargs) 2022-11-23T02:58:20.1172043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1172152Z return func(*args, **kwargs) 2022-11-23T02:58:20.1172405Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1172519Z self.run_subtests( 2022-11-23T02:58:20.1172874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1173039Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1173406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1173556Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1173937Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1174040Z output = model(*input) 2022-11-23T02:58:20.1174369Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1174504Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1174886Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1175059Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1175437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1175558Z _lazy_init(state, module) 2022-11-23T02:58:20.1175918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1176046Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1176386Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1176510Z return func(*args, **kwargs) 2022-11-23T02:58:20.1176893Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1176993Z p_assert( 2022-11-23T02:58:20.1177332Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1177456Z traceback.print_stack() 2022-11-23T02:58:20.1177589Z File "", line 1, in 2022-11-23T02:58:20.1177785Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1177928Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1178177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1178332Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1178544Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1178645Z self.run() 2022-11-23T02:58:20.1178848Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1178993Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1179322Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1179455Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1179818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1179989Z getattr(self, test_name)() 2022-11-23T02:58:20.1180353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1180449Z fn() 2022-11-23T02:58:20.1180822Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1180929Z test(self, **param_kwargs) 2022-11-23T02:58:20.1181289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1181412Z return func(*args, **kwargs) 2022-11-23T02:58:20.1181668Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1181776Z self.run_subtests( 2022-11-23T02:58:20.1182131Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1182295Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1182656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1182797Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1183176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1183296Z output = model(*input) 2022-11-23T02:58:20.1183625Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1183765Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1184148Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1184324Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1184696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1184817Z _lazy_init(state, module) 2022-11-23T02:58:20.1185160Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1185308Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1185646Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1185768Z return func(*args, **kwargs) 2022-11-23T02:58:20.1186151Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1186252Z p_assert( 2022-11-23T02:58:20.1186592Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1186702Z traceback.print_stack() 2022-11-23T02:58:20.1186941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1187179Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1187308Z File "", line 1, in 2022-11-23T02:58:20.1187566Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1187716Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1187918Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1188067Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1188267Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1188369Z self.run() 2022-11-23T02:58:20.1188570Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1188714Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1189296Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1189438Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1189816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1189939Z getattr(self, test_name)() 
2022-11-23T02:58:20.1190289Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1190388Z fn() 2022-11-23T02:58:20.1190758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1190880Z test(self, **param_kwargs) 2022-11-23T02:58:20.1191236Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1191357Z return func(*args, **kwargs) 2022-11-23T02:58:20.1191614Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1191726Z self.run_subtests( 2022-11-23T02:58:20.1192066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1192230Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1192597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1192750Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1193128Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1193242Z output = model(*input) 2022-11-23T02:58:20.1193570Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1193712Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1194082Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1194254Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1194628Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1194747Z _lazy_init(state, module) 2022-11-23T02:58:20.1195101Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1195243Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1195583Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1195703Z return func(*args, **kwargs) 2022-11-23T02:58:20.1196072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1196170Z p_assert( 2022-11-23T02:58:20.1196510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1196636Z traceback.print_stack() 2022-11-23T02:58:20.1196760Z File "", line 1, in 2022-11-23T02:58:20.1197039Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1197185Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1197371Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1197523Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1197733Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1197833Z self.run() 2022-11-23T02:58:20.1198037Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1198182Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1198590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1198719Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1199075Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1199199Z getattr(self, test_name)() 
2022-11-23T02:58:20.1199563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1199661Z fn() 2022-11-23T02:58:20.1200031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1200151Z test(self, **param_kwargs) 2022-11-23T02:58:20.1200506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1200627Z return func(*args, **kwargs) 2022-11-23T02:58:20.1200870Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1200980Z self.run_subtests( 2022-11-23T02:58:20.1201340Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1201502Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1201872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1202025Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1202405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1202522Z output = model(*input) 2022-11-23T02:58:20.1202834Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1202976Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1203364Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1203542Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1203915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1204034Z _lazy_init(state, module) 2022-11-23T02:58:20.1204389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1204529Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1204853Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1204976Z return func(*args, **kwargs) 2022-11-23T02:58:20.1205358Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1205460Z p_assert( 2022-11-23T02:58:20.1205803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1205931Z traceback.print_stack() 2022-11-23T02:58:20.1206217Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1206463Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1206577Z File "", line 1, in 2022-11-23T02:58:20.1206789Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1206927Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1207132Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1207282Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1207495Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1207642Z self.run() 2022-11-23T02:58:20.1207832Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1207975Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1208328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1208461Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1208826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1208948Z getattr(self, test_name)() 2022-11-23T02:58:20.1209310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1209407Z fn() 2022-11-23T02:58:20.1209760Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1209886Z test(self, **param_kwargs) 2022-11-23T02:58:20.1210250Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1210372Z return func(*args, **kwargs) 2022-11-23T02:58:20.1210631Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1210741Z self.run_subtests( 2022-11-23T02:58:20.1211095Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1211255Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1211608Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1211762Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1212141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1212261Z output = model(*input) 2022-11-23T02:58:20.1212587Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1212727Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1213109Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1213284Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1213643Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1213763Z _lazy_init(state, module) 2022-11-23T02:58:20.1214115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1214257Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1214598Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1214725Z return func(*args, **kwargs) 2022-11-23T02:58:20.1215110Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1215251Z p_assert( 2022-11-23T02:58:20.1215584Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1215710Z traceback.print_stack() 2022-11-23T02:58:20.1215839Z File "", line 1, in 2022-11-23T02:58:20.1216049Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1216193Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1216397Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1216548Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1216764Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1216901Z self.run() 2022-11-23T02:58:20.1217105Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1217252Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1217603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1217739Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1218105Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1218228Z getattr(self, test_name)() 2022-11-23T02:58:20.1218591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1218671Z fn() 2022-11-23T02:58:20.1219043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1219168Z test(self, **param_kwargs) 2022-11-23T02:58:20.1219530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1219654Z return func(*args, **kwargs) 2022-11-23T02:58:20.1219912Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1220024Z self.run_subtests( 2022-11-23T02:58:20.1220367Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1220529Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1220900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1221052Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1221430Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1221550Z output = model(*input) 2022-11-23T02:58:20.1221878Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1222016Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1222398Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1222559Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1222930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1223049Z _lazy_init(state, module) 2022-11-23T02:58:20.1223404Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1223546Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1223886Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1224015Z return func(*args, **kwargs) 2022-11-23T02:58:20.1224397Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1224528Z p_assert( 2022-11-23T02:58:20.1224879Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1225005Z traceback.print_stack() 2022-11-23T02:58:20.1225250Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1225489Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1225616Z File "", line 1, in 2022-11-23T02:58:20.1225824Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1225952Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1226210Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1226361Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1226576Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1226680Z self.run() 2022-11-23T02:58:20.1226885Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1227029Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1227375Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1227492Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1227860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1227983Z getattr(self, test_name)() 2022-11-23T02:58:20.1228344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1228444Z fn() 2022-11-23T02:58:20.1228811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1229140Z test(self, **param_kwargs) 2022-11-23T02:58:20.1229530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1229639Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.1229895Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1230008Z self.run_subtests( 2022-11-23T02:58:20.1230358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1230519Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1230887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1231046Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1231425Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1231533Z output = model(*input) 2022-11-23T02:58:20.1231864Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1232005Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1232384Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1232560Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1232929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1233049Z _lazy_init(state, module) 2022-11-23T02:58:20.1233410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1233539Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1233948Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1234082Z return func(*args, **kwargs) 2022-11-23T02:58:20.1234470Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1234568Z p_assert( 2022-11-23T02:58:20.1234905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1235031Z traceback.print_stack() 2022-11-23T02:58:20.1235157Z File "", line 1, in 2022-11-23T02:58:20.1235351Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1235490Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1235819Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1235969Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1236179Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1236283Z self.run() 2022-11-23T02:58:20.1236483Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1236613Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1236962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1237093Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1237459Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1237581Z getattr(self, test_name)() 2022-11-23T02:58:20.1237944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1238037Z fn() 2022-11-23T02:58:20.1238406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1238513Z test(self, **param_kwargs) 2022-11-23T02:58:20.1238878Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1238998Z return func(*args, **kwargs) 2022-11-23T02:58:20.1239247Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1239358Z self.run_subtests( 2022-11-23T02:58:20.1239716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1239874Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1240244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1240385Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1240764Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1240886Z output = model(*input) 2022-11-23T02:58:20.1241218Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1241359Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1241738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1241913Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1242281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1242386Z _lazy_init(state, module) 2022-11-23T02:58:20.1242747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1242887Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1243273Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1243399Z return func(*args, **kwargs) 2022-11-23T02:58:20.1243780Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1243876Z p_assert( 2022-11-23T02:58:20.1244216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1244327Z traceback.print_stack() 2022-11-23T02:58:20.1244567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1244805Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1245000Z File "", line 1, in 2022-11-23T02:58:20.1245214Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1245355Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1245561Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1245714Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1245912Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1246015Z self.run() 2022-11-23T02:58:20.1246215Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1246360Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1246707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1246838Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1247208Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1247315Z getattr(self, test_name)() 2022-11-23T02:58:20.1247678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1247779Z fn() 2022-11-23T02:58:20.1248146Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1248272Z test(self, **param_kwargs) 2022-11-23T02:58:20.1248633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1248753Z return func(*args, **kwargs) 2022-11-23T02:58:20.1249006Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1249102Z self.run_subtests( 2022-11-23T02:58:20.1249453Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1249615Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1249985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1250182Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1250566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1250682Z output = model(*input) 2022-11-23T02:58:20.1251010Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1251135Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1251516Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1251694Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1252067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1252186Z _lazy_init(state, module) 2022-11-23T02:58:20.1252590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1252739Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1253081Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1253190Z return func(*args, **kwargs) 2022-11-23T02:58:20.1253574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1253674Z p_assert( 2022-11-23T02:58:20.1254015Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1254192Z traceback.print_stack() 2022-11-23T02:58:20.1254319Z File "", line 1, in 2022-11-23T02:58:20.1254531Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1254673Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1254864Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1255012Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1255225Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1255326Z self.run() 2022-11-23T02:58:20.1255530Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1255674Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1256021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1256152Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1256507Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1256629Z getattr(self, test_name)() 2022-11-23T02:58:20.1256992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1257087Z fn() 2022-11-23T02:58:20.1257498Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1257625Z test(self, **param_kwargs) 2022-11-23T02:58:20.1257985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1258112Z return func(*args, **kwargs) 2022-11-23T02:58:20.1258353Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1258465Z self.run_subtests( 2022-11-23T02:58:20.1258829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1258989Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1259363Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1259516Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1259897Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1260015Z output = model(*input) 2022-11-23T02:58:20.1260330Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1260470Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1260851Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1261033Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1261403Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1261525Z _lazy_init(state, module) 2022-11-23T02:58:20.1261928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1262082Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1262411Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1262536Z return func(*args, **kwargs) 2022-11-23T02:58:20.1262914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1263014Z p_assert( 2022-11-23T02:58:20.1263352Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1263526Z traceback.print_stack() 2022-11-23T02:58:20.1263762Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1264003Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1264119Z File "", line 1, in 2022-11-23T02:58:20.1264329Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1264469Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1264672Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1264821Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1265034Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1265137Z self.run() 2022-11-23T02:58:20.1265327Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1265477Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1265824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1265955Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1266323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1266448Z getattr(self, test_name)() 
2022-11-23T02:58:20.1266811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1266907Z fn() 2022-11-23T02:58:20.1267261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1267383Z test(self, **param_kwargs) 2022-11-23T02:58:20.1267742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1267871Z return func(*args, **kwargs) 2022-11-23T02:58:20.1268126Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1268237Z self.run_subtests( 2022-11-23T02:58:20.1268598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1268759Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1269283Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1269448Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1269834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1269951Z output = model(*input) 2022-11-23T02:58:20.1270284Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1270432Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1270814Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1270987Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1271414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1271542Z _lazy_init(state, module) 2022-11-23T02:58:20.1271901Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1272044Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1272386Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1272513Z return func(*args, **kwargs) 2022-11-23T02:58:20.1272895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1273059Z p_assert( 2022-11-23T02:58:20.1273389Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1273513Z traceback.print_stack() 2022-11-23T02:58:20.1273646Z File "", line 1, in 2022-11-23T02:58:20.1273859Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1274000Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1274202Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1274353Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1274553Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1274657Z self.run() 2022-11-23T02:58:20.1274857Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1275005Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1275348Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1275482Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1275849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1275967Z getattr(self, test_name)() 
2022-11-23T02:58:20.1276316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1276413Z fn() 2022-11-23T02:58:20.1276780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1276902Z test(self, **param_kwargs) 2022-11-23T02:58:20.1277255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1277381Z return func(*args, **kwargs) 2022-11-23T02:58:20.1277638Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1277750Z self.run_subtests( 2022-11-23T02:58:20.1278092Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1278256Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1278627Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1278780Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1279157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1279279Z output = model(*input) 2022-11-23T02:58:20.1279606Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1279753Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1280119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1280343Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1280727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1280844Z _lazy_init(state, module) 2022-11-23T02:58:20.1281195Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1281338Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1281678Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1281800Z return func(*args, **kwargs) 2022-11-23T02:58:20.1282168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1282318Z p_assert( 2022-11-23T02:58:20.1282663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1282787Z traceback.print_stack() 2022-11-23T02:58:20.1283029Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1283266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1283394Z File "", line 1, in 2022-11-23T02:58:20.1283604Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1283731Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1283932Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1284082Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1284298Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1284397Z self.run() 2022-11-23T02:58:20.1284598Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1284744Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1285096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1285213Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1285580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1285698Z getattr(self, test_name)() 2022-11-23T02:58:20.1286059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1286151Z fn() 2022-11-23T02:58:20.1286521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1286644Z test(self, **param_kwargs) 2022-11-23T02:58:20.1286990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1287110Z return func(*args, **kwargs) 2022-11-23T02:58:20.1287368Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1287478Z self.run_subtests( 2022-11-23T02:58:20.1287831Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1287994Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1288359Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1288505Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1288867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1288989Z output = model(*input) 2022-11-23T02:58:20.1289318Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1289507Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1289900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1290077Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1290442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1290563Z _lazy_init(state, module) 2022-11-23T02:58:20.1290921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1291048Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1291440Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1291565Z return func(*args, **kwargs) 2022-11-23T02:58:20.1291950Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1292049Z p_assert( 2022-11-23T02:58:20.1292391Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1292514Z traceback.print_stack() 2022-11-23T02:58:20.1292627Z File "", line 1, in 2022-11-23T02:58:20.1292837Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1292977Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1293175Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1293324Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1293540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1293638Z self.run() 2022-11-23T02:58:20.1293835Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1293968Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1294316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1294449Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1294817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1294939Z getattr(self, test_name)() 2022-11-23T02:58:20.1295300Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1295395Z fn() 2022-11-23T02:58:20.1295766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1295876Z test(self, **param_kwargs) 2022-11-23T02:58:20.1296238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1296363Z return func(*args, **kwargs) 2022-11-23T02:58:20.1296622Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1296735Z self.run_subtests( 2022-11-23T02:58:20.1297087Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1297247Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1297618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1297756Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1298135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1298256Z output = model(*input) 2022-11-23T02:58:20.1298589Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1298778Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1299166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1299340Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1299707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1299812Z _lazy_init(state, module) 2022-11-23T02:58:20.1300166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1300308Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1300712Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1300836Z return func(*args, **kwargs) 2022-11-23T02:58:20.1301220Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1301319Z p_assert( 2022-11-23T02:58:20.1301660Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1301771Z traceback.print_stack() 2022-11-23T02:58:20.1302009Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1302244Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1302370Z File "", line 1, in 2022-11-23T02:58:20.1302585Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1302729Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1302931Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1303066Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1303283Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1303383Z self.run() 2022-11-23T02:58:20.1303586Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1303732Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1304078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1304210Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1304574Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1304681Z getattr(self, test_name)() 2022-11-23T02:58:20.1305052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1305143Z fn() 2022-11-23T02:58:20.1305511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1305641Z test(self, **param_kwargs) 2022-11-23T02:58:20.1306001Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1306125Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.1306380Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1306477Z self.run_subtests( 2022-11-23T02:58:20.1306827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1306988Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1307358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1307512Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1307938Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1308056Z output = model(*input) 2022-11-23T02:58:20.1308386Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1308512Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1308891Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1309255Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1309635Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1309832Z _lazy_init(state, module) 2022-11-23T02:58:20.1310185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1310325Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1310664Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1310773Z return func(*args, **kwargs) 2022-11-23T02:58:20.1311156Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1311252Z p_assert( 2022-11-23T02:58:20.1311593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1311715Z traceback.print_stack() 2022-11-23T02:58:20.1311843Z File "", line 1, in 2022-11-23T02:58:20.1312051Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1312196Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1312383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1312527Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1312741Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1312844Z self.run() 2022-11-23T02:58:20.1313047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1313192Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1313539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1313656Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1314017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1314138Z getattr(self, test_name)() 2022-11-23T02:58:20.1314504Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1314600Z fn() 2022-11-23T02:58:20.1314967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1315086Z test(self, **param_kwargs) 2022-11-23T02:58:20.1315446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1315554Z return func(*args, **kwargs) 2022-11-23T02:58:20.1315807Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1315918Z self.run_subtests( 2022-11-23T02:58:20.1316269Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1316427Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1316799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1316949Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1317387Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1317499Z output = model(*input) 2022-11-23T02:58:20.1317832Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1317968Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1318347Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1318522Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1318885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1319052Z _lazy_init(state, module) 2022-11-23T02:58:20.1319410Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1319538Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1319884Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1320008Z return func(*args, **kwargs) 2022-11-23T02:58:20.1320393Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1320493Z p_assert( 2022-11-23T02:58:20.1320837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1320963Z traceback.print_stack() 2022-11-23T02:58:20.1321201Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1321423Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1321551Z File "", line 1, in 2022-11-23T02:58:20.1321765Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1321910Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1322115Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1322266Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1322482Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1322585Z self.run() 2022-11-23T02:58:20.1322774Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1322920Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1323264Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1323402Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1323767Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1323887Z getattr(self, test_name)() 2022-11-23T02:58:20.1324254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1324349Z fn() 2022-11-23T02:58:20.1324707Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1324825Z test(self, **param_kwargs) 2022-11-23T02:58:20.1325183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1325306Z return func(*args, **kwargs) 2022-11-23T02:58:20.1325561Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1325674Z self.run_subtests( 2022-11-23T02:58:20.1326037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1326183Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1326597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1326757Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1327134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1327248Z output = model(*input) 2022-11-23T02:58:20.1327575Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1327712Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1328094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1328317Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1328676Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1328794Z _lazy_init(state, module) 2022-11-23T02:58:20.1329149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1329290Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1329631Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1329754Z return func(*args, **kwargs) 2022-11-23T02:58:20.1330133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1330232Z p_assert( 2022-11-23T02:58:20.1330556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1330681Z traceback.print_stack() 2022-11-23T02:58:20.1330805Z File "", line 1, in 2022-11-23T02:58:20.1331014Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1331161Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1331366Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1331515Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1331712Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1331812Z self.run() 2022-11-23T02:58:20.1332010Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1332151Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1332493Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1332630Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1332995Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1333116Z getattr(self, test_name)() 2022-11-23T02:58:20.1333465Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1333561Z fn() 2022-11-23T02:58:20.1333929Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1334045Z test(self, **param_kwargs) 2022-11-23T02:58:20.1334402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1334527Z return func(*args, **kwargs) 2022-11-23T02:58:20.1334781Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1334894Z self.run_subtests( 2022-11-23T02:58:20.1335237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1335397Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1335810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1335967Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1336344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1336459Z output = model(*input) 2022-11-23T02:58:20.1336789Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1336930Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1337292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1337515Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1337887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1338010Z _lazy_init(state, module) 2022-11-23T02:58:20.1338359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1338501Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1338835Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1338957Z return func(*args, **kwargs) 2022-11-23T02:58:20.1339326Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1339428Z p_assert( 2022-11-23T02:58:20.1339767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1339897Z traceback.print_stack() 2022-11-23T02:58:20.1340137Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1340376Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1340506Z File "", line 1, in 2022-11-23T02:58:20.1340712Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1340838Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1341036Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1341188Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1341401Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1341501Z self.run() 2022-11-23T02:58:20.1341704Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1341853Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1342186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1342315Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1342681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1342803Z getattr(self, test_name)() 
2022-11-23T02:58:20.1343169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1343263Z fn() 2022-11-23T02:58:20.1343630Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1343753Z test(self, **param_kwargs) 2022-11-23T02:58:20.1344098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1344224Z return func(*args, **kwargs) 2022-11-23T02:58:20.1344476Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1344586Z self.run_subtests( 2022-11-23T02:58:20.1344989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1345158Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1345530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1345681Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1346046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1346160Z output = model(*input) 2022-11-23T02:58:20.1346485Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1346673Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1347055Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1347234Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1347602Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1347722Z _lazy_init(state, module) 2022-11-23T02:58:20.1348062Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1348203Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1348542Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1348667Z return func(*args, **kwargs) 2022-11-23T02:58:20.1349216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1349323Z p_assert( 2022-11-23T02:58:20.1349672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1349802Z traceback.print_stack() 2022-11-23T02:58:20.1349916Z File "", line 1, in 2022-11-23T02:58:20.1350156Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1350298Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1350502Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1350654Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1350868Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1350970Z self.run() 2022-11-23T02:58:20.1351158Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1351307Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1351652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1351785Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1352152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1352275Z getattr(self, test_name)() 
2022-11-23T02:58:20.1352639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1352735Z fn() 2022-11-23T02:58:20.1353085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1353207Z test(self, **param_kwargs) 2022-11-23T02:58:20.1353564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1353692Z return func(*args, **kwargs) 2022-11-23T02:58:20.1353950Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1354061Z self.run_subtests( 2022-11-23T02:58:20.1354482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1354650Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1355004Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1355157Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1355530Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1355649Z output = model(*input) 2022-11-23T02:58:20.1355979Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1356182Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1356562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1356740Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1357099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1357215Z _lazy_init(state, module) 2022-11-23T02:58:20.1357563Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1357703Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1358041Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1358163Z return func(*args, **kwargs) 2022-11-23T02:58:20.1358549Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1358647Z p_assert( 2022-11-23T02:58:20.1358972Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1359100Z traceback.print_stack() 2022-11-23T02:58:20.1359339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1359574Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1359801Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1360038Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1360267Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1360492Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1360712Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1360941Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1361174Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1361402Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1361627Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1361852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1362077Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1362301Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1362531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1362743Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1362967Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1363239Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1363469Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1363696Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1363924Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1364147Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1364371Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1364644Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1364872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1365097Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1365323Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1365550Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1365776Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1366000Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1366224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1366447Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1366661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1366887Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1367115Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1367341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1367566Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1367791Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1368015Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1368243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1368454Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1368683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1368908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1369138Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1369365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1369591Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1369816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1370043Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1370137Z dist init r=1, world=2 2022-11-23T02:58:20.1370478Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1370810Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1371174Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1371497Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1371809Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1372118Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1372475Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.1372788Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1373090Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1373394Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1373699Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1373993Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1374102Z dist init r=0, world=2 2022-11-23T02:58:20.1374436Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1374755Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1375065Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1375372Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1375676Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.1375988Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1376293Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1376596Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1376900Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1377186Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1377492Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1377878Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1377986Z ok (6.415s) 2022-11-23T02:58:20.1378331Z test_nested_wrapped_model_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91676 2022-11-23T02:58:20.1378554Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91677 2022-11-23T02:58:20.1378940Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1379118Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1379557Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1379735Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1380116Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1380292Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1380672Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1380862Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1381111Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.1381359Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.1381767Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1382163Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.1382381Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.1382615Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.1382849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1383081Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1384109Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1384225Z warnings.warn( 2022-11-23T02:58:20.1385255Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.1385362Z warnings.warn( 2022-11-23T02:58:20.1385489Z File "", line 1, in 2022-11-23T02:58:20.1385704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1385847Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1386038Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1386184Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1386449Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1386557Z self.run() 2022-11-23T02:58:20.1386759Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1386908Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1387259Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1387375Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1387741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1387864Z getattr(self, test_name)() 2022-11-23T02:58:20.1388281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1388377Z fn() 2022-11-23T02:58:20.1388745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1388869Z test(self, **param_kwargs) 2022-11-23T02:58:20.1389415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1389525Z return func(*args, **kwargs) 2022-11-23T02:58:20.1389781Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1389893Z self.run_subtests( 2022-11-23T02:58:20.1390250Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1390413Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1390785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1390935Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1391318Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1391422Z output = model(*input) 2022-11-23T02:58:20.1391754Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1391897Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1392274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1392449Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1392820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1392942Z _lazy_init(state, module) 2022-11-23T02:58:20.1393295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1393436Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1393764Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1393889Z return func(*args, **kwargs) 2022-11-23T02:58:20.1394274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1394373Z p_assert( 2022-11-23T02:58:20.1394709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1394835Z traceback.print_stack() 2022-11-23T02:58:20.1394962Z File "", line 1, in 2022-11-23T02:58:20.1395156Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1395302Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1395504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1395650Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1395937Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1396049Z self.run() 2022-11-23T02:58:20.1396249Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1396395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1396724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1396858Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1397221Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1397345Z getattr(self, test_name)() 2022-11-23T02:58:20.1397772Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1397866Z fn() 2022-11-23T02:58:20.1398234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1398358Z test(self, **param_kwargs) 2022-11-23T02:58:20.1398703Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1398825Z return func(*args, **kwargs) 2022-11-23T02:58:20.1399082Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1399195Z self.run_subtests( 2022-11-23T02:58:20.1399547Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1399705Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1400082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1400233Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1400602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1400724Z output = model(*input) 2022-11-23T02:58:20.1401054Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1401194Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1401574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1401753Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1402118Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1402240Z _lazy_init(state, module) 2022-11-23T02:58:20.1402580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1402723Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1403068Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1403190Z return func(*args, **kwargs) 2022-11-23T02:58:20.1403574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1403675Z p_assert( 2022-11-23T02:58:20.1404009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1404132Z traceback.print_stack() 2022-11-23T02:58:20.1404358Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1404596Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1404731Z File "", line 1, in 2022-11-23T02:58:20.1404946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1405130Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1405341Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1405492Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1405692Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1405793Z self.run() 2022-11-23T02:58:20.1405997Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1406141Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1406491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1406672Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1407039Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1407159Z getattr(self, test_name)() 2022-11-23T02:58:20.1407512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1407609Z fn() 2022-11-23T02:58:20.1407976Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1408100Z test(self, **param_kwargs) 2022-11-23T02:58:20.1408456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1408576Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.1408827Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1408943Z self.run_subtests( 2022-11-23T02:58:20.1409285Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1409445Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1409817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1409970Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1410351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1410464Z output = model(*input) 2022-11-23T02:58:20.1410793Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1410934Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1411299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1411477Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1411844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1411966Z _lazy_init(state, module) 2022-11-23T02:58:20.1412323Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1412463Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1412799Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1412924Z return func(*args, **kwargs) 2022-11-23T02:58:20.1413290Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1413391Z p_assert( 2022-11-23T02:58:20.1413728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1413857Z traceback.print_stack() 2022-11-23T02:58:20.1413990Z File "", line 1, in 2022-11-23T02:58:20.1414198Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1414387Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1414601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1414736Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1414949Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1415048Z self.run() 2022-11-23T02:58:20.1415253Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1415395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1415742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1415917Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1416270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1416388Z getattr(self, test_name)() 2022-11-23T02:58:20.1416754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1416853Z fn() 2022-11-23T02:58:20.1417223Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1417343Z test(self, **param_kwargs) 2022-11-23T02:58:20.1417700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1417822Z return func(*args, **kwargs) 2022-11-23T02:58:20.1418061Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1418177Z self.run_subtests( 2022-11-23T02:58:20.1418528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1418686Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1419059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1419214Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1419590Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1419708Z output = model(*input) 2022-11-23T02:58:20.1420022Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1420162Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1420542Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1420721Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1421090Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1421210Z _lazy_init(state, module) 2022-11-23T02:58:20.1421564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1421705Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1422029Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1422151Z return func(*args, **kwargs) 2022-11-23T02:58:20.1422535Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1422634Z p_assert( 2022-11-23T02:58:20.1422976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1423105Z traceback.print_stack() 2022-11-23T02:58:20.1423339Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1423637Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1423758Z File "", line 1, in 2022-11-23T02:58:20.1423973Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1424114Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1424317Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1424460Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1424677Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1424781Z self.run() 2022-11-23T02:58:20.1424984Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1425176Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1425523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1425653Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1426022Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1426148Z getattr(self, test_name)() 2022-11-23T02:58:20.1426513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1426609Z fn() 2022-11-23T02:58:20.1426962Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1427085Z test(self, **param_kwargs) 2022-11-23T02:58:20.1427444Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1427572Z return func(*args, **kwargs) 2022-11-23T02:58:20.1427825Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1427936Z self.run_subtests( 2022-11-23T02:58:20.1428295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1428458Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1428810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1429122Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1429525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1429642Z output = model(*input) 2022-11-23T02:58:20.1429971Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1430117Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1430495Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1430670Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1431041Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1431146Z _lazy_init(state, module) 2022-11-23T02:58:20.1431496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1431637Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1431977Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1432098Z return func(*args, **kwargs) 2022-11-23T02:58:20.1432480Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1432582Z p_assert( 2022-11-23T02:58:20.1432904Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1433103Z traceback.print_stack() 2022-11-23T02:58:20.1433242Z File "", line 1, in 2022-11-23T02:58:20.1433453Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1433597Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1433796Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1433945Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1434158Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1434252Z self.run() 2022-11-23T02:58:20.1434454Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1434662Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1435005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1435137Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1435505Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1435628Z getattr(self, test_name)() 2022-11-23T02:58:20.1435989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1436070Z fn() 2022-11-23T02:58:20.1436434Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1436554Z test(self, **param_kwargs) 2022-11-23T02:58:20.1436913Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1437039Z return func(*args, **kwargs) 2022-11-23T02:58:20.1437297Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1437408Z self.run_subtests( 2022-11-23T02:58:20.1437770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1437915Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1438286Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1438435Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1438818Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1438936Z output = model(*input) 2022-11-23T02:58:20.1439260Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1439405Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1439783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1439947Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1440322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1440441Z _lazy_init(state, module) 2022-11-23T02:58:20.1440796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1440937Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1441275Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1441398Z return func(*args, **kwargs) 2022-11-23T02:58:20.1441785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1441870Z p_assert( 2022-11-23T02:58:20.1442256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1442389Z traceback.print_stack() 2022-11-23T02:58:20.1442631Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1442871Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1442994Z File "", line 1, in 2022-11-23T02:58:20.1443198Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1443324Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1443524Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1443671Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1443931Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1444033Z self.run() 2022-11-23T02:58:20.1444236Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1444386Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1444732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1444850Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1445217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1445338Z getattr(self, test_name)() 
2022-11-23T02:58:20.1445704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1445800Z fn() 2022-11-23T02:58:20.1446168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1446295Z test(self, **param_kwargs) 2022-11-23T02:58:20.1446658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1446771Z return func(*args, **kwargs) 2022-11-23T02:58:20.1447026Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1447137Z self.run_subtests( 2022-11-23T02:58:20.1447491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1447648Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1448011Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1448164Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1448550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1448653Z output = model(*input) 2022-11-23T02:58:20.1448987Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1449126Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1449510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1449684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1450053Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1450212Z _lazy_init(state, module) 2022-11-23T02:58:20.1450575Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1450708Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1451051Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1451172Z return func(*args, **kwargs) 2022-11-23T02:58:20.1451603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1451707Z p_assert( 2022-11-23T02:58:20.1452049Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1452176Z traceback.print_stack() 2022-11-23T02:58:20.1452303Z File "", line 1, in 2022-11-23T02:58:20.1452498Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1452639Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1452841Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1453040Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1453250Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1453351Z self.run() 2022-11-23T02:58:20.1453555Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1453689Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1454038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1454168Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1454531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1454653Z getattr(self, test_name)() 
2022-11-23T02:58:20.1455016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1455108Z fn() 2022-11-23T02:58:20.1455476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1455587Z test(self, **param_kwargs) 2022-11-23T02:58:20.1455947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1456070Z return func(*args, **kwargs) 2022-11-23T02:58:20.1456321Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1456429Z self.run_subtests( 2022-11-23T02:58:20.1456780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1456936Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1457301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1457438Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1457821Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1457939Z output = model(*input) 2022-11-23T02:58:20.1458273Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1458410Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1458792Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1458965Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1459336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1459440Z _lazy_init(state, module) 2022-11-23T02:58:20.1459794Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1459938Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1460280Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1460401Z return func(*args, **kwargs) 2022-11-23T02:58:20.1460830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1460938Z p_assert( 2022-11-23T02:58:20.1461282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1461392Z traceback.print_stack() 2022-11-23T02:58:20.1461626Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1461859Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1461985Z File "", line 1, in 2022-11-23T02:58:20.1462194Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1462381Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1462583Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1462731Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1462933Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1463033Z self.run() 2022-11-23T02:58:20.1463234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1463379Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1463730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1463863Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1464230Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1464356Z getattr(self, test_name)() 2022-11-23T02:58:20.1464708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1464800Z fn() 2022-11-23T02:58:20.1465166Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1465286Z test(self, **param_kwargs) 2022-11-23T02:58:20.1465644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1465764Z return func(*args, **kwargs) 2022-11-23T02:58:20.1466018Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1466114Z self.run_subtests( 2022-11-23T02:58:20.1466464Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1466620Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1466993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1467142Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1467522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1467640Z output = model(*input) 2022-11-23T02:58:20.1467968Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1468093Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1468474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1468652Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1469182Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1469310Z _lazy_init(state, module) 2022-11-23T02:58:20.1469673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1469814Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1470229Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1470358Z return func(*args, **kwargs) 2022-11-23T02:58:20.1470727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1470826Z p_assert( 2022-11-23T02:58:20.1471162Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1471286Z traceback.print_stack() 2022-11-23T02:58:20.1471409Z File "", line 1, in 2022-11-23T02:58:20.1471616Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1471820Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1472008Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1472156Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1472367Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1472469Z self.run() 2022-11-23T02:58:20.1472669Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1472810Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1473157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1473287Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1473638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1473761Z getattr(self, test_name)() 2022-11-23T02:58:20.1474123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1474217Z fn() 2022-11-23T02:58:20.1474588Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1474704Z test(self, **param_kwargs) 2022-11-23T02:58:20.1475064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1475186Z return func(*args, **kwargs) 2022-11-23T02:58:20.1475424Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1475535Z self.run_subtests( 2022-11-23T02:58:20.1475890Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1476056Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1476428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1476581Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1476966Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1477082Z output = model(*input) 2022-11-23T02:58:20.1477395Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1477535Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1477915Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1478089Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1478459Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1478584Z _lazy_init(state, module) 2022-11-23T02:58:20.1478938Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1479132Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1479467Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1479592Z return func(*args, **kwargs) 2022-11-23T02:58:20.1479975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1480078Z p_assert( 2022-11-23T02:58:20.1480416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1480541Z traceback.print_stack() 2022-11-23T02:58:20.1480783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1481070Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1481184Z File "", line 1, in 2022-11-23T02:58:20.1481396Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1481539Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1481742Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1481889Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1482101Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1482200Z self.run() 2022-11-23T02:58:20.1482391Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1482539Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1482887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1483024Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1483389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1483510Z getattr(self, test_name)() 2022-11-23T02:58:20.1483874Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1483967Z fn() 2022-11-23T02:58:20.1484320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1484441Z test(self, **param_kwargs) 2022-11-23T02:58:20.1484798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1484922Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.1485179Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1485296Z self.run_subtests( 2022-11-23T02:58:20.1485650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1485806Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1486160Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1486311Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1486688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1486807Z output = model(*input) 2022-11-23T02:58:20.1487134Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1487272Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1487651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1487832Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1488187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1488352Z _lazy_init(state, module) 2022-11-23T02:58:20.1488718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1488858Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1489194Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1489312Z return func(*args, **kwargs) 2022-11-23T02:58:20.1489690Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1489791Z p_assert( 2022-11-23T02:58:20.1490115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1490305Z traceback.print_stack() 2022-11-23T02:58:20.1490433Z File "", line 1, in 2022-11-23T02:58:20.1490649Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1490787Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1490992Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1491139Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1491350Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1491436Z self.run() 2022-11-23T02:58:20.1491633Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1491777Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1492123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1492257Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1492618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1492735Z getattr(self, test_name)() 2022-11-23T02:58:20.1493086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1493179Z fn() 2022-11-23T02:58:20.1493544Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1493664Z test(self, **param_kwargs) 2022-11-23T02:58:20.1494025Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1494146Z return func(*args, **kwargs) 2022-11-23T02:58:20.1494399Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1494514Z self.run_subtests( 2022-11-23T02:58:20.1494854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1495015Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1495389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1495541Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1495923Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1496041Z output = model(*input) 2022-11-23T02:58:20.1496370Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1496509Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1496873Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1497053Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1497474Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1497596Z _lazy_init(state, module) 2022-11-23T02:58:20.1497947Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1498098Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1498440Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1498565Z return func(*args, **kwargs) 2022-11-23T02:58:20.1498932Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1499039Z p_assert( 2022-11-23T02:58:20.1499435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1499563Z traceback.print_stack() 2022-11-23T02:58:20.1499804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1500047Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1500173Z File "", line 1, in 2022-11-23T02:58:20.1500383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1500508Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1500706Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1500857Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1501079Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1501180Z self.run() 2022-11-23T02:58:20.1501386Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1501532Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1501864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1502002Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1502369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1502491Z getattr(self, test_name)() 2022-11-23T02:58:20.1502858Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1502957Z fn() 2022-11-23T02:58:20.1503328Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1503449Z test(self, **param_kwargs) 2022-11-23T02:58:20.1503794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1503923Z return func(*args, **kwargs) 2022-11-23T02:58:20.1504176Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1504291Z self.run_subtests( 2022-11-23T02:58:20.1504643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1504805Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1505174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1505324Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1505688Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1505808Z output = model(*input) 2022-11-23T02:58:20.1506139Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1506279Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1506711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1506895Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1507261Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1507382Z _lazy_init(state, module) 2022-11-23T02:58:20.1507723Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1507862Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1508205Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1508376Z return func(*args, **kwargs) 2022-11-23T02:58:20.1508767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1508866Z p_assert( 2022-11-23T02:58:20.1509388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1509515Z traceback.print_stack() 2022-11-23T02:58:20.1509627Z File "", line 1, in 2022-11-23T02:58:20.1509836Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1509976Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1510177Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1510329Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1510541Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1510641Z self.run() 2022-11-23T02:58:20.1510834Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1510978Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1511324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1511460Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1511829Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1511949Z getattr(self, test_name)() 2022-11-23T02:58:20.1512308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1512406Z fn() 2022-11-23T02:58:20.1512760Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1512879Z test(self, **param_kwargs) 2022-11-23T02:58:20.1513238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1513361Z return func(*args, **kwargs) 2022-11-23T02:58:20.1513618Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1513731Z self.run_subtests( 2022-11-23T02:58:20.1514083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1514246Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1514602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1514754Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1515133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1515249Z output = model(*input) 2022-11-23T02:58:20.1515582Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1515720Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1516167Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1516355Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1516715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1516838Z _lazy_init(state, module) 2022-11-23T02:58:20.1517194Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1517338Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1517680Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1517915Z return func(*args, **kwargs) 2022-11-23T02:58:20.1518303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1518403Z p_assert( 2022-11-23T02:58:20.1518732Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1518860Z traceback.print_stack() 2022-11-23T02:58:20.1519102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1519342Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1519470Z File "", line 1, in 2022-11-23T02:58:20.1519682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1519822Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1520022Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1520160Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1520374Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1520477Z self.run() 2022-11-23T02:58:20.1520680Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1520828Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1521176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1521307Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1521669Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1521775Z getattr(self, test_name)() 
2022-11-23T02:58:20.1522142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1522236Z fn() 2022-11-23T02:58:20.1522612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1522738Z test(self, **param_kwargs) 2022-11-23T02:58:20.1523104Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1523229Z return func(*args, **kwargs) 2022-11-23T02:58:20.1523482Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1523578Z self.run_subtests( 2022-11-23T02:58:20.1523934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1524097Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1524461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1524621Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1525000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1525118Z output = model(*input) 2022-11-23T02:58:20.1525494Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1525625Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1526011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1526186Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1526558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1526677Z _lazy_init(state, module) 2022-11-23T02:58:20.1527031Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1527226Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1527575Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1527684Z return func(*args, **kwargs) 2022-11-23T02:58:20.1528073Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1528176Z p_assert( 2022-11-23T02:58:20.1528519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1528642Z traceback.print_stack() 2022-11-23T02:58:20.1528769Z File "", line 1, in 2022-11-23T02:58:20.1528980Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1529107Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1529311Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1529465Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1529682Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1529782Z self.run() 2022-11-23T02:58:20.1529992Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1530135Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1530481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1530598Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1530968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1531091Z getattr(self, test_name)() 
2022-11-23T02:58:20.1531452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1531556Z fn() 2022-11-23T02:58:20.1531924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1532045Z test(self, **param_kwargs) 2022-11-23T02:58:20.1532409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1532518Z return func(*args, **kwargs) 2022-11-23T02:58:20.1532776Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1532888Z self.run_subtests( 2022-11-23T02:58:20.1533245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1533407Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1533776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1533930Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1534310Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1534412Z output = model(*input) 2022-11-23T02:58:20.1534794Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1534939Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1535321Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1535496Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1535862Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1535979Z _lazy_init(state, module) 2022-11-23T02:58:20.1536331Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1536507Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1536844Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1536966Z return func(*args, **kwargs) 2022-11-23T02:58:20.1537350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1537446Z p_assert( 2022-11-23T02:58:20.1537783Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1537906Z traceback.print_stack() 2022-11-23T02:58:20.1538146Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1538370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1538500Z File "", line 1, in 2022-11-23T02:58:20.1538713Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1538854Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1539058Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1539209Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1539427Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1539513Z self.run() 2022-11-23T02:58:20.1539717Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1539866Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1540213Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1540343Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1540706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1540834Z getattr(self, test_name)() 2022-11-23T02:58:20.1541202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1541283Z fn() 2022-11-23T02:58:20.1541651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1541770Z test(self, **param_kwargs) 2022-11-23T02:58:20.1542127Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1542249Z return func(*args, **kwargs) 2022-11-23T02:58:20.1542501Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1542612Z self.run_subtests( 2022-11-23T02:58:20.1542961Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1543111Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1543479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1543633Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1544059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1544182Z output = model(*input) 2022-11-23T02:58:20.1544510Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1544648Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1545024Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1545187Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1545555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1545720Z _lazy_init(state, module) 2022-11-23T02:58:20.1546079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1546229Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1546573Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1546696Z return func(*args, **kwargs) 2022-11-23T02:58:20.1547077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1547163Z p_assert( 2022-11-23T02:58:20.1547503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1547626Z traceback.print_stack() 2022-11-23T02:58:20.1547752Z File "", line 1, in 2022-11-23T02:58:20.1547966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1548106Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1548310Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1548462Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1548662Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1548761Z self.run() 2022-11-23T02:58:20.1549131Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1549286Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1549638Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1549773Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1550135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1550279Z getattr(self, test_name)() 2022-11-23T02:58:20.1550649Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1550745Z fn() 2022-11-23T02:58:20.1551119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1551243Z test(self, **param_kwargs) 2022-11-23T02:58:20.1551607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1551732Z return func(*args, **kwargs) 2022-11-23T02:58:20.1551986Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1552083Z self.run_subtests( 2022-11-23T02:58:20.1552439Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1552603Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1552973Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1553200Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1553593Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1553711Z output = model(*input) 2022-11-23T02:58:20.1554039Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1554165Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1554551Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1554727Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1555207Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1555327Z _lazy_init(state, module) 2022-11-23T02:58:20.1555681Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1555830Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1556176Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1556284Z return func(*args, **kwargs) 2022-11-23T02:58:20.1556663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1556763Z p_assert( 2022-11-23T02:58:20.1557108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.1557234Z traceback.print_stack() 2022-11-23T02:58:20.1557475Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1557765Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1557899Z File "", line 1, in 2022-11-23T02:58:20.1558099Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1558243Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1558445Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1558596Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1558812Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1558917Z self.run() 2022-11-23T02:58:20.1559117Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1559264Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1559598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1559735Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1560101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1560227Z getattr(self, test_name)() 2022-11-23T02:58:20.1560594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1560690Z fn() 2022-11-23T02:58:20.1561057Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1561179Z test(self, **param_kwargs) 2022-11-23T02:58:20.1561523Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1561647Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.1561905Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1562022Z self.run_subtests( 2022-11-23T02:58:20.1562376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1562587Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1562961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1563113Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1563475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1563592Z output = model(*input) 2022-11-23T02:58:20.1563916Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1564054Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1564484Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1564656Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1565027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1565144Z _lazy_init(state, module) 2022-11-23T02:58:20.1565487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1565630Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1565967Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1566091Z return func(*args, **kwargs) 2022-11-23T02:58:20.1566478Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1566581Z p_assert( 2022-11-23T02:58:20.1566913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1567039Z traceback.print_stack() 2022-11-23T02:58:20.1567151Z File "", line 1, in 2022-11-23T02:58:20.1567363Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1567501Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1567699Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1567848Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1568059Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1568161Z self.run() 2022-11-23T02:58:20.1568349Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1568493Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1568844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1568979Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1569344Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1569469Z getattr(self, test_name)() 2022-11-23T02:58:20.1569833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1569925Z fn() 2022-11-23T02:58:20.1570281Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1570402Z test(self, **param_kwargs) 2022-11-23T02:58:20.1570761Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1570884Z return func(*args, **kwargs) 2022-11-23T02:58:20.1571134Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1571251Z self.run_subtests( 2022-11-23T02:58:20.1571607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1571818Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1572178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1572332Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1572709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1572828Z output = model(*input) 2022-11-23T02:58:20.1573158Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1573300Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1573735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1573911Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1574268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1574387Z _lazy_init(state, module) 2022-11-23T02:58:20.1574742Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1574886Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1575226Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1575351Z return func(*args, **kwargs) 2022-11-23T02:58:20.1575734Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1575839Z p_assert( 2022-11-23T02:58:20.1576165Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1576288Z traceback.print_stack() 2022-11-23T02:58:20.1576533Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1576773Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1576900Z File "", line 1, in 2022-11-23T02:58:20.1577108Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1577248Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1577448Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1577583Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1577797Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1577899Z self.run() 2022-11-23T02:58:20.1578100Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1578246Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1578594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1578731Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1579085Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1579210Z getattr(self, test_name)() 2022-11-23T02:58:20.1579576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1579670Z fn() 2022-11-23T02:58:20.1580037Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1580158Z test(self, **param_kwargs) 2022-11-23T02:58:20.1580525Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1580648Z return func(*args, **kwargs) 2022-11-23T02:58:20.1580940Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1581060Z self.run_subtests( 2022-11-23T02:58:20.1581416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1581576Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1581945Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1582099Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1582475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1582643Z output = model(*input) 2022-11-23T02:58:20.1582958Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1583101Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1583483Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1583660Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1584027Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1584146Z _lazy_init(state, module) 2022-11-23T02:58:20.1584500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1584641Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1584970Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1585099Z return func(*args, **kwargs) 2022-11-23T02:58:20.1585481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1585583Z p_assert( 2022-11-23T02:58:20.1585929Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1586053Z traceback.print_stack() 2022-11-23T02:58:20.1586179Z File "", line 1, in 2022-11-23T02:58:20.1586390Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1586516Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1586722Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1586873Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1587089Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1587192Z self.run() 2022-11-23T02:58:20.1587398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1587547Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1587896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1588011Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1588379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1588498Z getattr(self, test_name)() 2022-11-23T02:58:20.1588864Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1589216Z fn() 2022-11-23T02:58:20.1589615Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1589738Z test(self, **param_kwargs) 2022-11-23T02:58:20.1590091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1590217Z return func(*args, **kwargs) 2022-11-23T02:58:20.1590551Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1590675Z self.run_subtests( 2022-11-23T02:58:20.1591037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1591199Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1591569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1591723Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1592089Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1592272Z output = model(*input) 2022-11-23T02:58:20.1592606Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1592748Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1593133Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1593313Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1593683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1593805Z _lazy_init(state, module) 2022-11-23T02:58:20.1594164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.1594291Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1594633Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1594761Z return func(*args, **kwargs) 2022-11-23T02:58:20.1595143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1595236Z p_assert( 2022-11-23T02:58:20.1595579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1595706Z traceback.print_stack() 2022-11-23T02:58:20.1595932Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1596172Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1596299Z File "", line 1, in 2022-11-23T02:58:20.1596515Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1596655Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1596856Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1597014Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1597229Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1597314Z self.run() 2022-11-23T02:58:20.1597522Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1597664Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1598010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1598145Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1598512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1598635Z getattr(self, test_name)() 
2022-11-23T02:58:20.1599000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1599084Z fn() 2022-11-23T02:58:20.1599456Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1599581Z test(self, **param_kwargs) 2022-11-23T02:58:20.1599990Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1600123Z return func(*args, **kwargs) 2022-11-23T02:58:20.1600382Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1600496Z self.run_subtests( 2022-11-23T02:58:20.1600855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1601003Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1601373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1601597Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1601984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1602104Z output = model(*input) 2022-11-23T02:58:20.1602434Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1602579Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1602964Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1603126Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1603497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1603616Z _lazy_init(state, module) 2022-11-23T02:58:20.1603973Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1604122Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1604467Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1604594Z return func(*args, **kwargs) 2022-11-23T02:58:20.1604982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1605068Z p_assert( 2022-11-23T02:58:20.1605411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1605536Z traceback.print_stack() 2022-11-23T02:58:20.1605666Z File "", line 1, in 2022-11-23T02:58:20.1605879Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1606023Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1606230Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1606365Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1606582Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1606685Z self.run() 2022-11-23T02:58:20.1606893Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1607039Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1607383Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1607516Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1607886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1607993Z getattr(self, test_name)() 
2022-11-23T02:58:20.1608357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1608457Z fn() 2022-11-23T02:58:20.1608830Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1608950Z test(self, **param_kwargs) 2022-11-23T02:58:20.1609360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1609492Z return func(*args, **kwargs) 2022-11-23T02:58:20.1609750Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 107, in test_nested_wrapped_model 2022-11-23T02:58:20.1609846Z self.run_subtests( 2022-11-23T02:58:20.1610205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1610368Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1610739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1610943Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1611327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1611451Z output = model(*input) 2022-11-23T02:58:20.1611783Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1611909Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1612291Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1612468Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1612838Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1612960Z _lazy_init(state, module) 2022-11-23T02:58:20.1613320Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1613463Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1613800Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1613911Z return func(*args, **kwargs) 2022-11-23T02:58:20.1614299Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1614401Z p_assert( 2022-11-23T02:58:20.1614739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1614864Z traceback.print_stack() 2022-11-23T02:58:20.1615106Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1615346Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1615588Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1615807Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1616040Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1616272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1616502Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1616734Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1616965Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1617188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1617418Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1617635Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1617868Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1618150Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1618389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1618619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1618849Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1619078Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1619308Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1619539Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1619812Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1620042Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1620278Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1620507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1620732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1620962Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1621188Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1621415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1621624Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1621857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1622083Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1622314Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1622544Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1622771Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1622995Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1623222Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1623431Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1623664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1623893Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1624121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1624349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1624580Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1624804Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.1625034Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1625260Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1625471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1625702Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1625930Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.1626041Z dist init r=1, world=2 2022-11-23T02:58:20.1626421Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1626756Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1627078Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1627413Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1627785Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1628106Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1628401Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.1628711Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1629260Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1629589Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1629899Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1630206Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1630322Z dist init r=0, world=2 2022-11-23T02:58:20.1630644Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1630958Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1631272Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1631591Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1631885Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.1632192Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1632500Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1632805Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1633117Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1633499Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1633818Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1634126Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1634228Z ok (6.214s) 2022-11-23T02:58:20.1634618Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91759 2022-11-23T02:58:20.1634898Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91760 2022-11-23T02:58:20.1635281Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1635462Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1635851Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1636047Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1636423Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1636598Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1636982Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1637177Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1637422Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.1637656Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.1638067Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1638472Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.1638707Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.1638939Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.1639977Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1640094Z warnings.warn( 2022-11-23T02:58:20.1641128Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1641240Z warnings.warn( 2022-11-23T02:58:20.1642062Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1642831Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:20.1643584Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1644389Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1645144Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1645893Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1646647Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1647398Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. 
Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1648138Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1648892Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1649010Z dist init r=1, world=2 2022-11-23T02:58:20.1649104Z dist init r=0, world=2 2022-11-23T02:58:20.1649205Z ok (5.313s) 2022-11-23T02:58:20.1649588Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91842 2022-11-23T02:58:20.1649811Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91843 2022-11-23T02:58:20.1650190Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1650412Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1650809Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1651004Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1651416Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1651603Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1651989Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1652182Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1652434Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.1652685Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.1653144Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1653544Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.1653782Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.1654000Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.1655035Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1655154Z warnings.warn( 2022-11-23T02:58:20.1656170Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1656281Z warnings.warn( 2022-11-23T02:58:20.1656391Z dist init r=0, world=2 2022-11-23T02:58:20.1656497Z dist init r=1, world=2 2022-11-23T02:58:20.1656596Z ok (5.513s) 2022-11-23T02:58:20.1656985Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 91925 2022-11-23T02:58:20.1657209Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 91926 2022-11-23T02:58:20.1657590Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1657752Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1658140Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1658335Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1658711Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1658884Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1659266Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1659458Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1659712Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.1659942Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.1660413Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1660825Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.1661060Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.1661292Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.1662326Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1662494Z warnings.warn( 2022-11-23T02:58:20.1663522Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1663635Z warnings.warn( 2022-11-23T02:58:20.1663745Z dist init r=0, world=2 2022-11-23T02:58:20.1663854Z dist init r=1, world=2 2022-11-23T02:58:20.1663936Z ok (5.413s) 2022-11-23T02:58:20.1664323Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_no_shard (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92008 2022-11-23T02:58:20.1664550Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92009 2022-11-23T02:58:20.1664925Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1665102Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1665492Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1665688Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1666066Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1666226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1666611Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1666802Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1667052Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.1667301Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.1667705Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1668105Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.1668336Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.1668569Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.1669874Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1669998Z warnings.warn( 2022-11-23T02:58:20.1671018Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.1671189Z warnings.warn( 2022-11-23T02:58:20.1671323Z File "", line 1, in 2022-11-23T02:58:20.1671539Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1671684Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1671893Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1672045Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1672263Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1672349Z self.run() 2022-11-23T02:58:20.1672555Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1672705Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1673065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1673204Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1673575Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1673701Z getattr(self, test_name)() 2022-11-23T02:58:20.1674070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1674152Z fn() 2022-11-23T02:58:20.1674526Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1674652Z test(self, **param_kwargs) 2022-11-23T02:58:20.1675017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1675144Z return func(*args, **kwargs) 2022-11-23T02:58:20.1675446Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1675566Z self.run_subtests( 
2022-11-23T02:58:20.1675926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1676074Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1676449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1676604Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1676985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1677107Z output = model(*input) 2022-11-23T02:58:20.1677439Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1677583Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1677714Z File "", line 1, in 2022-11-23T02:58:20.1678086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1678265Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1678683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1678813Z _lazy_init(state, module) 2022-11-23T02:58:20.1679026Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1679169Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1679528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1679672Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1679861Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1680012Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1680420Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1680547Z return func(*args, **kwargs) 2022-11-23T02:58:20.1680766Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1680875Z self.run() 2022-11-23T02:58:20.1681268Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1681372Z p_assert( 2022-11-23T02:58:20.1681562Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1681709Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1682050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1682178Z traceback.print_stack() 2022-11-23T02:58:20.1682517Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1682653Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1683024Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1683148Z getattr(self, test_name)() 2022-11-23T02:58:20.1683498Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1683595Z fn() 2022-11-23T02:58:20.1683967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1684089Z test(self, **param_kwargs) 2022-11-23T02:58:20.1684447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1684574Z return func(*args, **kwargs) 2022-11-23T02:58:20.1684875Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1684978Z self.run_subtests( 2022-11-23T02:58:20.1685333Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1685502Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1685868Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1686021Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1686399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1686518Z output = model(*input) 2022-11-23T02:58:20.1686847Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1686990Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1687354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1687538Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1687953Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1688082Z _lazy_init(state, module) 2022-11-23T02:58:20.1688441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1688587Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1688929Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1689055Z return func(*args, **kwargs) 2022-11-23T02:58:20.1689424Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1689527Z p_assert( 
2022-11-23T02:58:20.1689923Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1690050Z traceback.print_stack() 2022-11-23T02:58:20.1690186Z File "", line 1, in 2022-11-23T02:58:20.1690403Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1690548Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1690736Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1690890Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1691104Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1691207Z self.run() 2022-11-23T02:58:20.1691412Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1691560Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1691905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1692043Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1692394Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1692522Z getattr(self, test_name)() 2022-11-23T02:58:20.1692887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1692987Z fn() 2022-11-23T02:58:20.1693354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1693475Z test(self, **param_kwargs) 2022-11-23T02:58:20.1693833Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1693959Z return func(*args, **kwargs) 2022-11-23T02:58:20.1694245Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1694364Z self.run_subtests( 2022-11-23T02:58:20.1694720Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1694887Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1695260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1695413Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1695793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1695913Z output = model(*input) 2022-11-23T02:58:20.1696225Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1696367Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1696753Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1696932Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1697345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1697472Z _lazy_init(state, module) 2022-11-23T02:58:20.1697834Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1697980Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1698306Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1698431Z return func(*args, **kwargs) 2022-11-23T02:58:20.1698812Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T02:58:20.1698961Z p_assert( 2022-11-23T02:58:20.1699307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1699430Z traceback.print_stack() 2022-11-23T02:58:20.1699560Z File "", line 1, in 2022-11-23T02:58:20.1699778Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1699904Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1700108Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1700260Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1700475Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1700577Z self.run() 2022-11-23T02:58:20.1700783Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1700931Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1701265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1701401Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1701768Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1701890Z getattr(self, test_name)() 2022-11-23T02:58:20.1702256Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1702355Z fn() 2022-11-23T02:58:20.1702726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1702850Z test(self, **param_kwargs) 2022-11-23T02:58:20.1703195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1703318Z return func(*args, **kwargs) 
2022-11-23T02:58:20.1703622Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1703737Z self.run_subtests( 2022-11-23T02:58:20.1704095Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1704259Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1704632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1704785Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1705149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1705269Z output = model(*input) 2022-11-23T02:58:20.1705599Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1705743Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1706126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1706303Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1706719Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1706847Z _lazy_init(state, module) 2022-11-23T02:58:20.1707193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1707338Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1707683Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1707810Z return func(*args, **kwargs) 2022-11-23T02:58:20.1708194Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1708346Z p_assert( 2022-11-23T02:58:20.1708691Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1708819Z traceback.print_stack() 2022-11-23T02:58:20.1709101Z File "", line 1, in 2022-11-23T02:58:20.1709332Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1709475Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1709680Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1709833Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1710048Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1710151Z self.run() 2022-11-23T02:58:20.1710355Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1710490Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1710848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1710982Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1711356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1711481Z getattr(self, test_name)() 2022-11-23T02:58:20.1711844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1711942Z fn() 2022-11-23T02:58:20.1712294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1712418Z test(self, **param_kwargs) 2022-11-23T02:58:20.1712780Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1712910Z return func(*args, **kwargs) 2022-11-23T02:58:20.1713211Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1713329Z self.run_subtests( 2022-11-23T02:58:20.1713690Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1713855Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1714224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1714363Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1714745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1714865Z output = model(*input) 2022-11-23T02:58:20.1715193Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1715340Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1715720Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1715965Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1716349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1716453Z _lazy_init(state, module) 2022-11-23T02:58:20.1716810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1716955Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1717301Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1717425Z return func(*args, **kwargs) 
2022-11-23T02:58:20.1717806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1717967Z p_assert( 2022-11-23T02:58:20.1718314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1718430Z traceback.print_stack() 2022-11-23T02:58:20.1718561Z File "", line 1, in 2022-11-23T02:58:20.1718774Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1718917Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1719124Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1719276Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1719496Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1719582Z self.run() 2022-11-23T02:58:20.1719789Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1719936Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1720284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1720419Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1720793Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1720916Z getattr(self, test_name)() 2022-11-23T02:58:20.1721282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1721362Z fn() 2022-11-23T02:58:20.1721732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1721856Z test(self, **param_kwargs) 2022-11-23T02:58:20.1722219Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1722349Z return func(*args, **kwargs) 2022-11-23T02:58:20.1722652Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1722767Z self.run_subtests( 2022-11-23T02:58:20.1723129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1723275Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1723644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1723797Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1724174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1724294Z output = model(*input) 2022-11-23T02:58:20.1724626Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1724770Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1725150Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1725350Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1725732Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1725853Z _lazy_init(state, module) 2022-11-23T02:58:20.1726210Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1726354Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1726694Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:58:20.1726820Z return func(*args, **kwargs) 2022-11-23T02:58:20.1727256Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1727341Z p_assert( 2022-11-23T02:58:20.1727685Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1727814Z traceback.print_stack() 2022-11-23T02:58:20.1727940Z File "", line 1, in 2022-11-23T02:58:20.1728150Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1728293Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1728497Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1728648Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1728847Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1728949Z self.run() 2022-11-23T02:58:20.1729156Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1729303Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1729648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1729788Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1730155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1730261Z getattr(self, test_name)() 2022-11-23T02:58:20.1730622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1730717Z fn() 2022-11-23T02:58:20.1731084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1731209Z test(self, **param_kwargs) 
2022-11-23T02:58:20.1731567Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1731700Z return func(*args, **kwargs) 2022-11-23T02:58:20.1732001Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1732100Z self.run_subtests( 2022-11-23T02:58:20.1732461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1732622Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1732993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1733148Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1733534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1733660Z output = model(*input) 2022-11-23T02:58:20.1733992Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1734117Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1734538Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1734722Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1735095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1735214Z _lazy_init(state, module) 2022-11-23T02:58:20.1735568Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1735714Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1736060Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1736214Z return func(*args, **kwargs) 2022-11-23T02:58:20.1736603Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1736706Z p_assert( 2022-11-23T02:58:20.1737054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1737183Z traceback.print_stack() 2022-11-23T02:58:20.1737313Z File "", line 1, in 2022-11-23T02:58:20.1737528Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1737672Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1737861Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1738014Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1738228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1738336Z self.run() 2022-11-23T02:58:20.1738540Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1738685Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1739031Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1739167Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1739520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1739647Z getattr(self, test_name)() 2022-11-23T02:58:20.1740010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1740108Z fn() 2022-11-23T02:58:20.1740478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:58:20.1740603Z test(self, **param_kwargs) 2022-11-23T02:58:20.1740970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1741081Z return func(*args, **kwargs) 2022-11-23T02:58:20.1741383Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1741498Z self.run_subtests( 2022-11-23T02:58:20.1741856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1742021Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1742389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1742542Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1742924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1743049Z output = model(*input) 2022-11-23T02:58:20.1743362Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1743506Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1743930Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1744116Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1744489Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1744610Z _lazy_init(state, module) 2022-11-23T02:58:20.1744965Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1745115Z handle.init_flat_param_attributes() 
2022-11-23T02:58:20.1745439Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1745626Z return func(*args, **kwargs) 2022-11-23T02:58:20.1746010Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1746113Z p_assert( 2022-11-23T02:58:20.1746456Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1746583Z traceback.print_stack() 2022-11-23T02:58:20.1746711Z File "", line 1, in 2022-11-23T02:58:20.1746908Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1747052Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1747253Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1747406Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1747623Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1747728Z self.run() 2022-11-23T02:58:20.1747935Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1748084Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1748416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1748550Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1748916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1749208Z getattr(self, test_name)() 2022-11-23T02:58:20.1749587Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1749686Z fn() 2022-11-23T02:58:20.1750056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:58:20.1750189Z test(self, **param_kwargs) 2022-11-23T02:58:20.1750568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1750697Z return func(*args, **kwargs) 2022-11-23T02:58:20.1751004Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1751119Z self.run_subtests( 2022-11-23T02:58:20.1751479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1751643Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1752013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1752167Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1752531Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1752654Z output = model(*input) 2022-11-23T02:58:20.1752986Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1753127Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1753577Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1753767Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1754149Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1754272Z _lazy_init(state, module) 2022-11-23T02:58:20.1754612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1754758Z 
handle.init_flat_param_attributes() 2022-11-23T02:58:20.1755102Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1755285Z return func(*args, **kwargs) 2022-11-23T02:58:20.1755675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1755781Z p_assert( 2022-11-23T02:58:20.1756123Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1756250Z traceback.print_stack() 2022-11-23T02:58:20.1756363Z File "", line 1, in 2022-11-23T02:58:20.1756574Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1756717Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1756920Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1757070Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1757287Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1757393Z self.run() 2022-11-23T02:58:20.1757584Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1757733Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1758084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1758218Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1758585Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1758706Z getattr(self, test_name)() 2022-11-23T02:58:20.1759067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1759164Z fn() 2022-11-23T02:58:20.1759517Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1759645Z test(self, **param_kwargs) 2022-11-23T02:58:20.1760009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1760136Z return func(*args, **kwargs) 2022-11-23T02:58:20.1760436Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1760551Z self.run_subtests( 2022-11-23T02:58:20.1760909Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1761074Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1761424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1761580Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1761959Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1762083Z output = model(*input) 2022-11-23T02:58:20.1762415Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1762600Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1762992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1763172Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1763526Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1763648Z _lazy_init(state, module) 2022-11-23T02:58:20.1764005Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1764148Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1764539Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1764667Z return func(*args, **kwargs) 2022-11-23T02:58:20.1765056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1765159Z p_assert( 2022-11-23T02:58:20.1765491Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1765620Z traceback.print_stack() 2022-11-23T02:58:20.1765748Z File "", line 1, in 2022-11-23T02:58:20.1765958Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1766101Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1766302Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1766457Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1766679Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1766765Z self.run() 2022-11-23T02:58:20.1766969Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1767120Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1767467Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1767599Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1767970Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1768091Z getattr(self, test_name)() 2022-11-23T02:58:20.1768455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T02:58:20.1768536Z fn() 2022-11-23T02:58:20.1768906Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1769033Z test(self, **param_kwargs) 2022-11-23T02:58:20.1769395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1769522Z return func(*args, **kwargs) 2022-11-23T02:58:20.1769822Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1769937Z self.run_subtests( 2022-11-23T02:58:20.1770295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1770440Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1770810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1770966Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1771350Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1771469Z output = model(*input) 2022-11-23T02:58:20.1771838Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1771987Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1772375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1772538Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1772911Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1773034Z _lazy_init(state, module) 
2022-11-23T02:58:20.1773392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1773579Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1773926Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1774052Z return func(*args, **kwargs) 2022-11-23T02:58:20.1774441Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1774529Z p_assert( 2022-11-23T02:58:20.1774870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1774997Z traceback.print_stack() 2022-11-23T02:58:20.1775127Z File "", line 1, in 2022-11-23T02:58:20.1775340Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1775483Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1775687Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1775827Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1776047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1776152Z self.run() 2022-11-23T02:58:20.1776360Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1776506Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1776852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1776986Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1777353Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1777459Z getattr(self, test_name)() 2022-11-23T02:58:20.1777821Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1777919Z fn() 2022-11-23T02:58:20.1778295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1778417Z test(self, **param_kwargs) 2022-11-23T02:58:20.1778776Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1778906Z return func(*args, **kwargs) 2022-11-23T02:58:20.1779206Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1779303Z self.run_subtests( 2022-11-23T02:58:20.1779663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1779825Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1780194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1780354Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1780737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1780853Z output = model(*input) 2022-11-23T02:58:20.1781244Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1781378Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1781766Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1781942Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1782314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.1782439Z _lazy_init(state, module) 2022-11-23T02:58:20.1782795Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1782988Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1783335Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1783443Z return func(*args, **kwargs) 2022-11-23T02:58:20.1783833Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1783935Z p_assert( 2022-11-23T02:58:20.1784281Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1784408Z traceback.print_stack() 2022-11-23T02:58:20.1784539Z File "", line 1, in 2022-11-23T02:58:20.1784752Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1784894Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1785080Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1785237Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1785453Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1785556Z self.run() 2022-11-23T02:58:20.1785762Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1785907Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1786254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1786370Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1786737Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1786861Z getattr(self, test_name)() 
2022-11-23T02:58:20.1787226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1787329Z fn() 2022-11-23T02:58:20.1787695Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1787818Z test(self, **param_kwargs) 2022-11-23T02:58:20.1788179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1788289Z return func(*args, **kwargs) 2022-11-23T02:58:20.1788584Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1788696Z self.run_subtests( 2022-11-23T02:58:20.1789226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1789398Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1789774Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1789937Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1790324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1790428Z output = model(*input) 2022-11-23T02:58:20.1790830Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1790983Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1791369Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1791546Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1791923Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1792046Z _lazy_init(state, module) 2022-11-23T02:58:20.1792405Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1792613Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1792943Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1793075Z return func(*args, **kwargs) 2022-11-23T02:58:20.1793463Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1793566Z p_assert( 2022-11-23T02:58:20.1793906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1794034Z traceback.print_stack() 2022-11-23T02:58:20.1794163Z File "", line 1, in 2022-11-23T02:58:20.1794360Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1794503Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1794712Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1794863Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1795078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1795182Z self.run() 2022-11-23T02:58:20.1795388Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1795537Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1795867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1796001Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1796369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:58:20.1796490Z getattr(self, test_name)() 2022-11-23T02:58:20.1796856Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1796959Z fn() 2022-11-23T02:58:20.1797331Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1797452Z test(self, **param_kwargs) 2022-11-23T02:58:20.1797802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1797927Z return func(*args, **kwargs) 2022-11-23T02:58:20.1798231Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1798348Z self.run_subtests( 2022-11-23T02:58:20.1798706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1798870Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1799238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1799399Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1799766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1799927Z output = model(*input) 2022-11-23T02:58:20.1800269Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1800415Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1800797Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1800975Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T02:58:20.1801350Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1801472Z _lazy_init(state, module) 2022-11-23T02:58:20.1801883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1802028Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1802375Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1802498Z return func(*args, **kwargs) 2022-11-23T02:58:20.1802885Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1802990Z p_assert( 2022-11-23T02:58:20.1803334Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1803462Z traceback.print_stack() 2022-11-23T02:58:20.1803574Z File "", line 1, in 2022-11-23T02:58:20.1803787Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1803931Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1804141Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1804294Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1804513Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1804617Z self.run() 2022-11-23T02:58:20.1804807Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1804953Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1805301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1805436Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1805804Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1805932Z getattr(self, test_name)() 2022-11-23T02:58:20.1806299Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1806403Z fn() 2022-11-23T02:58:20.1806757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1806886Z test(self, **param_kwargs) 2022-11-23T02:58:20.1807249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1807375Z return func(*args, **kwargs) 2022-11-23T02:58:20.1807675Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1807791Z self.run_subtests( 2022-11-23T02:58:20.1808146Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1808309Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1808666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1808822Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1809247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1809374Z output = model(*input) 2022-11-23T02:58:20.1809712Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1809854Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1810235Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T02:58:20.1810413Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1810767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1810958Z _lazy_init(state, module) 2022-11-23T02:58:20.1811322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1811466Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1811812Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1811939Z return func(*args, **kwargs) 2022-11-23T02:58:20.1812329Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1812432Z p_assert( 2022-11-23T02:58:20.1812758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1812885Z traceback.print_stack() 2022-11-23T02:58:20.1813016Z File "", line 1, in 2022-11-23T02:58:20.1813231Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1813379Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1813583Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1813737Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1813957Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1814043Z self.run() 2022-11-23T02:58:20.1814246Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1814395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1814739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1814873Z self.run_test(test_name, 
pipe) 2022-11-23T02:58:20.1815238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1815361Z getattr(self, test_name)() 2022-11-23T02:58:20.1815713Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1815809Z fn() 2022-11-23T02:58:20.1816178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1816307Z test(self, **param_kwargs) 2022-11-23T02:58:20.1816668Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1816795Z return func(*args, **kwargs) 2022-11-23T02:58:20.1817099Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1817213Z self.run_subtests( 2022-11-23T02:58:20.1817553Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1817717Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1818093Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1818247Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1818674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1818802Z output = model(*input) 2022-11-23T02:58:20.1819139Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1819285Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1819651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T02:58:20.1819829Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1820202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1820369Z _lazy_init(state, module) 2022-11-23T02:58:20.1820734Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1820879Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1821227Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1821353Z return func(*args, **kwargs) 2022-11-23T02:58:20.1821742Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1821827Z p_assert( 2022-11-23T02:58:20.1822169Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1822298Z traceback.print_stack() 2022-11-23T02:58:20.1822428Z File "", line 1, in 2022-11-23T02:58:20.1822641Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1822790Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1822995Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1823130Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1823348Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1823452Z self.run() 2022-11-23T02:58:20.1823658Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1823809Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1824152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1824285Z 
self.run_test(test_name, pipe) 2022-11-23T02:58:20.1824652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1824763Z getattr(self, test_name)() 2022-11-23T02:58:20.1825130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1825226Z fn() 2022-11-23T02:58:20.1825596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1825720Z test(self, **param_kwargs) 2022-11-23T02:58:20.1826083Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1826210Z return func(*args, **kwargs) 2022-11-23T02:58:20.1826512Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1826608Z self.run_subtests( 2022-11-23T02:58:20.1826965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1827132Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1827506Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1827661Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1828088Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1828214Z output = model(*input) 2022-11-23T02:58:20.1828551Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1828677Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1829232Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1829427Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1829809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1830009Z _lazy_init(state, module) 2022-11-23T02:58:20.1830372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1830521Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1830868Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1830976Z return func(*args, **kwargs) 2022-11-23T02:58:20.1831359Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1831460Z p_assert( 2022-11-23T02:58:20.1831799Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1831926Z traceback.print_stack() 2022-11-23T02:58:20.1832056Z File "", line 1, in 2022-11-23T02:58:20.1832276Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1832404Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1832609Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1832764Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1832980Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1833083Z self.run() 2022-11-23T02:58:20.1833287Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1833433Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1833784Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1833901Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1834271Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1834400Z getattr(self, test_name)() 2022-11-23T02:58:20.1834763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1834862Z fn() 2022-11-23T02:58:20.1835237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1835358Z test(self, **param_kwargs) 2022-11-23T02:58:20.1835724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1835833Z return func(*args, **kwargs) 2022-11-23T02:58:20.1836132Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1836246Z self.run_subtests( 2022-11-23T02:58:20.1836601Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1836767Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1837139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1837351Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1837748Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1837851Z output = model(*input) 2022-11-23T02:58:20.1838185Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1838328Z return 
forward_call(*input, **kwargs) 2022-11-23T02:58:20.1838714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1838894Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1839322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1839444Z _lazy_init(state, module) 2022-11-23T02:58:20.1839806Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1839933Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1840276Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1840400Z return func(*args, **kwargs) 2022-11-23T02:58:20.1840785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1840888Z p_assert( 2022-11-23T02:58:20.1841226Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1841358Z traceback.print_stack() 2022-11-23T02:58:20.1841492Z File "", line 1, in 2022-11-23T02:58:20.1841686Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1841829Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1842036Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1842187Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1842402Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1842506Z self.run() 2022-11-23T02:58:20.1842713Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1842844Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:58:20.1843192Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1843327Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1843693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1843822Z getattr(self, test_name)() 2022-11-23T02:58:20.1844189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1844286Z fn() 2022-11-23T02:58:20.1844663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1844771Z test(self, **param_kwargs) 2022-11-23T02:58:20.1845139Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1845265Z return func(*args, **kwargs) 2022-11-23T02:58:20.1845563Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1845675Z self.run_subtests( 2022-11-23T02:58:20.1846036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1846202Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1846613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1846757Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1847140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1847259Z output = model(*input) 2022-11-23T02:58:20.1847594Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:58:20.1847737Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1848117Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1848293Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1848716Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1848837Z _lazy_init(state, module) 2022-11-23T02:58:20.1849183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1849329Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1849674Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1849800Z return func(*args, **kwargs) 2022-11-23T02:58:20.1850187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1850335Z p_assert( 2022-11-23T02:58:20.1850685Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1850802Z traceback.print_stack() 2022-11-23T02:58:20.1850929Z File "", line 1, in 2022-11-23T02:58:20.1851142Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1851289Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1851496Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1851649Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1851867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1851972Z self.run() 2022-11-23T02:58:20.1852160Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1852307Z 
self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1852651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1852783Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1853155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1853279Z getattr(self, test_name)() 2022-11-23T02:58:20.1853645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1853744Z fn() 2022-11-23T02:58:20.1854100Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1854222Z test(self, **param_kwargs) 2022-11-23T02:58:20.1854582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1854708Z return func(*args, **kwargs) 2022-11-23T02:58:20.1855009Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1855123Z self.run_subtests( 2022-11-23T02:58:20.1855486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1855650Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1856050Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1856209Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1856594Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1856714Z output = model(*input) 2022-11-23T02:58:20.1857044Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T02:58:20.1857188Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1857618Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1857854Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1858213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1858335Z _lazy_init(state, module) 2022-11-23T02:58:20.1858698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1858845Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1859191Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1859320Z return func(*args, **kwargs) 2022-11-23T02:58:20.1859698Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1859800Z p_assert( 2022-11-23T02:58:20.1860122Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1860255Z traceback.print_stack() 2022-11-23T02:58:20.1860387Z File "", line 1, in 2022-11-23T02:58:20.1860601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1860749Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1860957Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1861110Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1861310Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1861414Z self.run() 2022-11-23T02:58:20.1861617Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T02:58:20.1861762Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1862108Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1862247Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1862618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1862743Z getattr(self, test_name)() 2022-11-23T02:58:20.1863091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1863189Z fn() 2022-11-23T02:58:20.1863558Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1863684Z test(self, **param_kwargs) 2022-11-23T02:58:20.1864047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1864172Z return func(*args, **kwargs) 2022-11-23T02:58:20.1864475Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1864593Z self.run_subtests( 2022-11-23T02:58:20.1864932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1865094Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1865508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1865666Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1866052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1866173Z output = model(*input) 2022-11-23T02:58:20.1866505Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1866648Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1867011Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1867236Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1867613Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1867738Z _lazy_init(state, module) 2022-11-23T02:58:20.1868097Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1868240Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1868581Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1868706Z return func(*args, **kwargs) 2022-11-23T02:58:20.1869248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1869363Z p_assert( 2022-11-23T02:58:20.1869713Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1869849Z traceback.print_stack() 2022-11-23T02:58:20.1869980Z File "", line 1, in 2022-11-23T02:58:20.1870193Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1870343Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1870550Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1870685Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1870903Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1871007Z self.run() 2022-11-23T02:58:20.1871213Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1871362Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1871712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1871851Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1872204Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1872331Z getattr(self, test_name)() 2022-11-23T02:58:20.1872701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1872802Z fn() 2022-11-23T02:58:20.1873173Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1873298Z test(self, **param_kwargs) 2022-11-23T02:58:20.1873659Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1873788Z return func(*args, **kwargs) 2022-11-23T02:58:20.1874072Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1874192Z self.run_subtests( 2022-11-23T02:58:20.1874549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1874781Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1875163Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1875321Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1875703Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1875825Z output = model(*input) 
2022-11-23T02:58:20.1876137Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1876282Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1876664Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1876918Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1877302Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1877425Z _lazy_init(state, module) 2022-11-23T02:58:20.1877784Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1877932Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1878276Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1878386Z return func(*args, **kwargs) 2022-11-23T02:58:20.1878773Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1878880Z p_assert( 2022-11-23T02:58:20.1879227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1879355Z traceback.print_stack() 2022-11-23T02:58:20.1879486Z File "", line 1, in 2022-11-23T02:58:20.1879704Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1879831Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1880037Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1880190Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1880406Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1880512Z self.run() 
2022-11-23T02:58:20.1880719Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1880868Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1881217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1881338Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1881709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1881838Z getattr(self, test_name)() 2022-11-23T02:58:20.1882206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1882305Z fn() 2022-11-23T02:58:20.1882676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1882800Z test(self, **param_kwargs) 2022-11-23T02:58:20.1883162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1883271Z return func(*args, **kwargs) 2022-11-23T02:58:20.1883572Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1883692Z self.run_subtests( 2022-11-23T02:58:20.1884049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1884257Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1884633Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1884789Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1885171Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:58:20.1885273Z output = model(*input) 2022-11-23T02:58:20.1885606Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1885751Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1886183Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1886363Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1886740Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1886863Z _lazy_init(state, module) 2022-11-23T02:58:20.1887223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1887351Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1887696Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1887822Z return func(*args, **kwargs) 2022-11-23T02:58:20.1888206Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1894752Z p_assert( 2022-11-23T02:58:20.1895180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1895315Z traceback.print_stack() 2022-11-23T02:58:20.1895450Z File "", line 1, in 2022-11-23T02:58:20.1895658Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1895804Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1896012Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1896165Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1896383Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 
2022-11-23T02:58:20.1896485Z self.run() 2022-11-23T02:58:20.1896691Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1896837Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1897181Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1897317Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1897694Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1897819Z getattr(self, test_name)() 2022-11-23T02:58:20.1898187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1898284Z fn() 2022-11-23T02:58:20.1898655Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1898776Z test(self, **param_kwargs) 2022-11-23T02:58:20.1899124Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1899249Z return func(*args, **kwargs) 2022-11-23T02:58:20.1899558Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1899676Z self.run_subtests( 2022-11-23T02:58:20.1900116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1900289Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1900665Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1900821Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1901190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in 
_train_for_several_steps 2022-11-23T02:58:20.1901313Z output = model(*input) 2022-11-23T02:58:20.1901646Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1901840Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1902228Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1902405Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1902782Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1902904Z _lazy_init(state, module) 2022-11-23T02:58:20.1903245Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1903388Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1903728Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1903854Z return func(*args, **kwargs) 2022-11-23T02:58:20.1904239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1904345Z p_assert( 2022-11-23T02:58:20.1904686Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1904813Z traceback.print_stack() 2022-11-23T02:58:20.1905564Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1906327Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1907083Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1907850Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1908595Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1909630Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.1909843Z dist init r=1, world=2 2022-11-23T02:58:20.1910192Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1910516Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.1910832Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1911145Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1911512Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1911829Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1912157Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1912476Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1912789Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1913085Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1913397Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.1913704Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.1913816Z dist init r=0, world=2 2022-11-23T02:58:20.1914131Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1914441Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1914755Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1915064Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1915370Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1915675Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1915982Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1916271Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1916581Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1916926Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.1917242Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1917547Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.1917647Z ok (5.513s) 2022-11-23T02:58:20.1918029Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92091 2022-11-23T02:58:20.1918292Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92092 2022-11-23T02:58:20.1918692Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1918874Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1919246Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1919442Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1919818Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.1919992Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.1920378Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.1920576Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.1920829Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.1921079Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 
2022-11-23T02:58:20.1921491Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1921876Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.1922112Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.1922342Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.1923387Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.1923502Z warnings.warn( 2022-11-23T02:58:20.1924523Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.1924636Z warnings.warn( 2022-11-23T02:58:20.1924765Z File "", line 1, in 2022-11-23T02:58:20.1924984Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1925127Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1925377Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1925518Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1925739Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1925844Z self.run() 2022-11-23T02:58:20.1926053Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1926202Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1926556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1926689Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1927090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1927211Z getattr(self, test_name)() 2022-11-23T02:58:20.1927584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1927682Z fn() 2022-11-23T02:58:20.1928053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1928175Z test(self, **param_kwargs) 2022-11-23T02:58:20.1928534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1928660Z return func(*args, **kwargs) 2022-11-23T02:58:20.1928947Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1929060Z self.run_subtests( 
2022-11-23T02:58:20.1929421Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1929585Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1929960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1930115Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1930494Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1930614Z output = model(*input) 2022-11-23T02:58:20.1930928Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1931069Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1931447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1931629Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1932006Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1932127Z _lazy_init(state, module) 2022-11-23T02:58:20.1932487Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1932631Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1932960Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1933087Z return func(*args, **kwargs) 2022-11-23T02:58:20.1933471Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1933573Z p_assert( 2022-11-23T02:58:20.1933913Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:58:20.1934045Z traceback.print_stack() 2022-11-23T02:58:20.1934175Z File "", line 1, in 2022-11-23T02:58:20.1934383Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1934553Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1934764Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1934919Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1935130Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1935236Z self.run() 2022-11-23T02:58:20.1935438Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1935584Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1935934Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1936094Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1936463Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1936589Z getattr(self, test_name)() 2022-11-23T02:58:20.1936951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1937050Z fn() 2022-11-23T02:58:20.1937418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1937540Z test(self, **param_kwargs) 2022-11-23T02:58:20.1937883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1938007Z return func(*args, **kwargs) 2022-11-23T02:58:20.1938307Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1938421Z 
self.run_subtests( 2022-11-23T02:58:20.1938773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1938932Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1939302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1939455Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1939835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1939938Z output = model(*input) 2022-11-23T02:58:20.1940269Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1940409Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1940790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1940972Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1941346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1941469Z _lazy_init(state, module) 2022-11-23T02:58:20.1941827Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1941959Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1942303Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1942429Z return func(*args, **kwargs) 2022-11-23T02:58:20.1942810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1942912Z p_assert( 2022-11-23T02:58:20.1943248Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1943377Z traceback.print_stack() 2022-11-23T02:58:20.1943504Z File "", line 1, in 2022-11-23T02:58:20.1943700Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1943888Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1944102Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1944255Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1944472Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1944575Z self.run() 2022-11-23T02:58:20.1944778Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1944907Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1945258Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1945454Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1945824Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1945949Z getattr(self, test_name)() 2022-11-23T02:58:20.1946316Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1946415Z fn() 2022-11-23T02:58:20.1946781Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1946888Z test(self, **param_kwargs) 2022-11-23T02:58:20.1947249Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1947376Z return func(*args, **kwargs) 2022-11-23T02:58:20.1947677Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1947795Z self.run_subtests( 2022-11-23T02:58:20.1948147Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1948314Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1948686Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1948822Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1949477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1949605Z output = model(*input) 2022-11-23T02:58:20.1949948Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1950091Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1950508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1950696Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1951072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1951178Z _lazy_init(state, module) 2022-11-23T02:58:20.1951537Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1951680Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1952023Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1952147Z return func(*args, **kwargs) 2022-11-23T02:58:20.1952531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1952633Z p_assert( 
2022-11-23T02:58:20.1952978Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1953089Z traceback.print_stack() 2022-11-23T02:58:20.1953216Z File "", line 1, in 2022-11-23T02:58:20.1953513Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1953666Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1953866Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1954014Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1954231Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1954317Z self.run() 2022-11-23T02:58:20.1954522Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1954668Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1955023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1955214Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1955586Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1955714Z getattr(self, test_name)() 2022-11-23T02:58:20.1956082Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1956164Z fn() 2022-11-23T02:58:20.1956534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1956656Z test(self, **param_kwargs) 2022-11-23T02:58:20.1957018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1957143Z return func(*args, **kwargs) 2022-11-23T02:58:20.1957445Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1957565Z self.run_subtests( 2022-11-23T02:58:20.1957924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1958075Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1958449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1958603Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1958985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1959103Z output = model(*input) 2022-11-23T02:58:20.1959435Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1959576Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1959969Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1960130Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1960506Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1960629Z _lazy_init(state, module) 2022-11-23T02:58:20.1960988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1961131Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1961472Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1961596Z return func(*args, **kwargs) 2022-11-23T02:58:20.1961980Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T02:58:20.1962069Z p_assert( 2022-11-23T02:58:20.1962408Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1962533Z traceback.print_stack() 2022-11-23T02:58:20.1962662Z File "", line 1, in 2022-11-23T02:58:20.1962918Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1963069Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1963275Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1963428Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1963627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1963731Z self.run() 2022-11-23T02:58:20.1963936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1964082Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1964502Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1964639Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1965010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1965136Z getattr(self, test_name)() 2022-11-23T02:58:20.1965482Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1965580Z fn() 2022-11-23T02:58:20.1965948Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1966070Z test(self, **param_kwargs) 2022-11-23T02:58:20.1966429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1966555Z return func(*args, **kwargs) 
2022-11-23T02:58:20.1966859Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1966972Z self.run_subtests( 2022-11-23T02:58:20.1967315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1967481Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1967855Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1968009Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1968388Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1968508Z output = model(*input) 2022-11-23T02:58:20.1968835Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1968982Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1969348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1969527Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1969903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1970025Z _lazy_init(state, module) 2022-11-23T02:58:20.1970382Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1970526Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1970870Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1970994Z return func(*args, **kwargs) 2022-11-23T02:58:20.1971362Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1971469Z p_assert( 2022-11-23T02:58:20.1971812Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1971938Z traceback.print_stack() 2022-11-23T02:58:20.1972111Z File "", line 1, in 2022-11-23T02:58:20.1972331Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1972475Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1972662Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1972816Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1973031Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1973134Z self.run() 2022-11-23T02:58:20.1973341Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1973487Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1973879Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1974015Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1974369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1974493Z getattr(self, test_name)() 2022-11-23T02:58:20.1974861Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1974956Z fn() 2022-11-23T02:58:20.1975320Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1975443Z test(self, **param_kwargs) 2022-11-23T02:58:20.1975799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.1975923Z return func(*args, **kwargs) 2022-11-23T02:58:20.1976213Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1976326Z self.run_subtests( 2022-11-23T02:58:20.1976681Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1976842Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1977212Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1977365Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1977746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1977864Z output = model(*input) 2022-11-23T02:58:20.1978177Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1978324Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1978707Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1978887Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1979259Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1979379Z _lazy_init(state, module) 2022-11-23T02:58:20.1979735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1979877Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1980204Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1980331Z return func(*args, **kwargs) 
2022-11-23T02:58:20.1980714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1980821Z p_assert( 2022-11-23T02:58:20.1981164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1981331Z traceback.print_stack() 2022-11-23T02:58:20.1981468Z File "", line 1, in 2022-11-23T02:58:20.1981678Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1981804Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1982009Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1982160Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1982374Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1982478Z self.run() 2022-11-23T02:58:20.1982681Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1982871Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1983205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1983339Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1983709Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1983835Z getattr(self, test_name)() 2022-11-23T02:58:20.1984197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1984298Z fn() 2022-11-23T02:58:20.1984671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1984793Z test(self, **param_kwargs) 2022-11-23T02:58:20.1985139Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1985267Z return func(*args, **kwargs) 2022-11-23T02:58:20.1985563Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1985677Z self.run_subtests( 2022-11-23T02:58:20.1986038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1986202Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1986569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1986723Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1987087Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1987209Z output = model(*input) 2022-11-23T02:58:20.1987539Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1987687Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1988067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1988248Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1988622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1988741Z _lazy_init(state, module) 2022-11-23T02:58:20.1989279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1989412Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1989763Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:58:20.1989890Z return func(*args, **kwargs) 2022-11-23T02:58:20.1990283Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1990385Z p_assert( 2022-11-23T02:58:20.1990727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.1990927Z traceback.print_stack() 2022-11-23T02:58:20.1991046Z File "", line 1, in 2022-11-23T02:58:20.1991261Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.1991403Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.1991608Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.1991760Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.1991976Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.1992079Z self.run() 2022-11-23T02:58:20.1992284Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.1992475Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.1992826Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.1992964Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.1993330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.1993455Z getattr(self, test_name)() 2022-11-23T02:58:20.1993814Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.1993914Z fn() 2022-11-23T02:58:20.1994280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.1994385Z test(self, **param_kwargs) 
2022-11-23T02:58:20.1994745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.1994873Z return func(*args, **kwargs) 2022-11-23T02:58:20.1995173Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.1995288Z self.run_subtests( 2022-11-23T02:58:20.1995645Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.1995811Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.1996180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.1996317Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.1996700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.1996818Z output = model(*input) 2022-11-23T02:58:20.1997154Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.1997296Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.1997678Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.1997855Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.1998227Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.1998330Z _lazy_init(state, module) 2022-11-23T02:58:20.1998687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.1998832Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.1999173Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.1999303Z return func(*args, **kwargs) 2022-11-23T02:58:20.1999687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.1999790Z p_assert( 2022-11-23T02:58:20.2000173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2000288Z traceback.print_stack() 2022-11-23T02:58:20.2000420Z File "", line 1, in 2022-11-23T02:58:20.2000635Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2000780Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2000984Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2001139Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2001352Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2001437Z self.run() 2022-11-23T02:58:20.2001683Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2001833Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2002180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2002317Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2002683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2002805Z getattr(self, test_name)() 2022-11-23T02:58:20.2003167Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2003246Z fn() 2022-11-23T02:58:20.2003614Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:58:20.2003736Z test(self, **param_kwargs) 2022-11-23T02:58:20.2004101Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2004226Z return func(*args, **kwargs) 2022-11-23T02:58:20.2004530Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2004644Z self.run_subtests( 2022-11-23T02:58:20.2005000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2005146Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2005520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2005673Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2006053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2006178Z output = model(*input) 2022-11-23T02:58:20.2006508Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2006650Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2007034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2007195Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2007569Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2007689Z _lazy_init(state, module) 2022-11-23T02:58:20.2008043Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2008184Z handle.init_flat_param_attributes() 
2022-11-23T02:58:20.2008524Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2008652Z return func(*args, **kwargs) 2022-11-23T02:58:20.2009035Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2009120Z p_assert( 2022-11-23T02:58:20.2009504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2009639Z traceback.print_stack() 2022-11-23T02:58:20.2009770Z File "", line 1, in 2022-11-23T02:58:20.2009983Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2010127Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2010331Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2010483Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2010681Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2010841Z self.run() 2022-11-23T02:58:20.2011047Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2011194Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2011546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2011681Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2012048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2012154Z getattr(self, test_name)() 2022-11-23T02:58:20.2012518Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2012615Z fn() 2022-11-23T02:58:20.2012985Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:58:20.2013108Z test(self, **param_kwargs) 2022-11-23T02:58:20.2013475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2013601Z return func(*args, **kwargs) 2022-11-23T02:58:20.2013904Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2014002Z self.run_subtests( 2022-11-23T02:58:20.2014356Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2014518Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2014887Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2015039Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2015420Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2015545Z output = model(*input) 2022-11-23T02:58:20.2015875Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2016000Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2016388Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2016565Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2016939Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2017060Z _lazy_init(state, module) 2022-11-23T02:58:20.2017411Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2017553Z 
handle.init_flat_param_attributes() 2022-11-23T02:58:20.2017892Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2018021Z return func(*args, **kwargs) 2022-11-23T02:58:20.2018393Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2018539Z p_assert( 2022-11-23T02:58:20.2018892Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2019022Z traceback.print_stack() 2022-11-23T02:58:20.2019152Z File "", line 1, in 2022-11-23T02:58:20.2019363Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2019508Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2019696Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2019847Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2020062Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2020214Z self.run() 2022-11-23T02:58:20.2020422Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2020572Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2020924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2021060Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2021414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2021538Z getattr(self, test_name)() 2022-11-23T02:58:20.2021903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2022001Z fn() 2022-11-23T02:58:20.2022370Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2022497Z test(self, **param_kwargs) 2022-11-23T02:58:20.2022860Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2022985Z return func(*args, **kwargs) 2022-11-23T02:58:20.2023270Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2023385Z self.run_subtests( 2022-11-23T02:58:20.2023742Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2023905Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2024272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2024426Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2024803Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2024928Z output = model(*input) 2022-11-23T02:58:20.2025246Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2025390Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2025771Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2025950Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2026323Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2026443Z _lazy_init(state, module) 2022-11-23T02:58:20.2026798Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2026939Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2027269Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2027395Z return func(*args, **kwargs) 2022-11-23T02:58:20.2027822Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2027932Z p_assert( 2022-11-23T02:58:20.2028274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2028404Z traceback.print_stack() 2022-11-23T02:58:20.2028534Z File "", line 1, in 2022-11-23T02:58:20.2028745Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2028871Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2029258Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2029419Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2029715Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2029820Z self.run() 2022-11-23T02:58:20.2030025Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2030177Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2030514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2030649Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2031016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2031140Z getattr(self, test_name)() 2022-11-23T02:58:20.2031503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T02:58:20.2031599Z fn() 2022-11-23T02:58:20.2031965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2032093Z test(self, **param_kwargs) 2022-11-23T02:58:20.2032437Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2032559Z return func(*args, **kwargs) 2022-11-23T02:58:20.2032862Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2032978Z self.run_subtests( 2022-11-23T02:58:20.2033335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2033498Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2033866Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2034019Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2034389Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2034510Z output = model(*input) 2022-11-23T02:58:20.2034845Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2034987Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2035370Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2035549Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2035918Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2036038Z _lazy_init(state, module) 
2022-11-23T02:58:20.2036377Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2036526Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2036866Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2036991Z return func(*args, **kwargs) 2022-11-23T02:58:20.2037430Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2037543Z p_assert( 2022-11-23T02:58:20.2037887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2038018Z traceback.print_stack() 2022-11-23T02:58:20.2038129Z File "", line 1, in 2022-11-23T02:58:20.2038342Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2038484Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2038691Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2038888Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2039104Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2039206Z self.run() 2022-11-23T02:58:20.2039413Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2039546Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2039895Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2040032Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2040400Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2040524Z getattr(self, test_name)() 2022-11-23T02:58:20.2040887Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2040988Z fn() 2022-11-23T02:58:20.2041343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2041473Z test(self, **param_kwargs) 2022-11-23T02:58:20.2041834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2041962Z return func(*args, **kwargs) 2022-11-23T02:58:20.2042264Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2042377Z self.run_subtests( 2022-11-23T02:58:20.2042734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2042897Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2043251Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2043410Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2043789Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2043907Z output = model(*input) 2022-11-23T02:58:20.2044239Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2044382Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2044765Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2044940Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2045313Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2045417Z _lazy_init(state, module) 2022-11-23T02:58:20.2045772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2045918Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2046262Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2046387Z return func(*args, **kwargs) 2022-11-23T02:58:20.2046818Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2046928Z p_assert( 2022-11-23T02:58:20.2047279Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2047389Z traceback.print_stack() 2022-11-23T02:58:20.2047520Z File "", line 1, in 2022-11-23T02:58:20.2047732Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2047878Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2048083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2048282Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2048502Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2048589Z self.run() 2022-11-23T02:58:20.2048800Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2048950Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2049303Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2049437Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2049805Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2049930Z getattr(self, test_name)() 
2022-11-23T02:58:20.2050298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2050424Z fn() 2022-11-23T02:58:20.2050801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2050928Z test(self, **param_kwargs) 2022-11-23T02:58:20.2051294Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2051420Z return func(*args, **kwargs) 2022-11-23T02:58:20.2051720Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2051833Z self.run_subtests( 2022-11-23T02:58:20.2052187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2052334Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2052704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2052860Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2053240Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2053362Z output = model(*input) 2022-11-23T02:58:20.2053696Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2053838Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2054217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2054377Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2054751Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2054871Z _lazy_init(state, module) 2022-11-23T02:58:20.2055225Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2055368Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2055710Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2055883Z return func(*args, **kwargs) 2022-11-23T02:58:20.2056282Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2056368Z p_assert( 2022-11-23T02:58:20.2056712Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2056839Z traceback.print_stack() 2022-11-23T02:58:20.2056967Z File "", line 1, in 2022-11-23T02:58:20.2057180Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2057323Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2057582Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2057718Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2057934Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2058038Z self.run() 2022-11-23T02:58:20.2058246Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2058394Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2058745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2058880Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2059252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:58:20.2059358Z getattr(self, test_name)() 2022-11-23T02:58:20.2059722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2059824Z fn() 2022-11-23T02:58:20.2060194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2060315Z test(self, **param_kwargs) 2022-11-23T02:58:20.2060682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2060809Z return func(*args, **kwargs) 2022-11-23T02:58:20.2061110Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2061205Z self.run_subtests( 2022-11-23T02:58:20.2061564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2061728Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2062096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2062254Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2062635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2062757Z output = model(*input) 2022-11-23T02:58:20.2063091Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2063217Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2063598Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2063773Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T02:58:20.2064142Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2064263Z _lazy_init(state, module) 2022-11-23T02:58:20.2064623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2064767Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2065158Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2065274Z return func(*args, **kwargs) 2022-11-23T02:58:20.2065663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2065766Z p_assert( 2022-11-23T02:58:20.2066108Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2066236Z traceback.print_stack() 2022-11-23T02:58:20.2066365Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2066575Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2066717Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2066956Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2067111Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2067328Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2067435Z self.run() 2022-11-23T02:58:20.2067638Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2067783Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2068129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2068245Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2068614Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2068741Z getattr(self, test_name)() 2022-11-23T02:58:20.2069279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2069394Z fn() 2022-11-23T02:58:20.2069770Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2069895Z test(self, **param_kwargs) 2022-11-23T02:58:20.2070261Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2070368Z return func(*args, **kwargs) 2022-11-23T02:58:20.2070668Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2070780Z self.run_subtests( 2022-11-23T02:58:20.2071135Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2071297Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2071670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2071822Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2072207Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2072328Z output = model(*input) 2022-11-23T02:58:20.2072641Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2072781Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2073161Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T02:58:20.2073340Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2073709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2073832Z _lazy_init(state, module) 2022-11-23T02:58:20.2074188Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2074331Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2074723Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2074860Z return func(*args, **kwargs) 2022-11-23T02:58:20.2075249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2075350Z p_assert( 2022-11-23T02:58:20.2075687Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2075813Z traceback.print_stack() 2022-11-23T02:58:20.2075942Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2076136Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2076353Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2076561Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2076714Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2076932Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2077036Z self.run() 2022-11-23T02:58:20.2077241Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2077387Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2077721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2077854Z self.run_test(test_name,
pipe) 2022-11-23T02:58:20.2078222Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2078348Z getattr(self, test_name)() 2022-11-23T02:58:20.2078717Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2078814Z fn() 2022-11-23T02:58:20.2079184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2079309Z test(self, **param_kwargs) 2022-11-23T02:58:20.2079656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2079781Z return func(*args, **kwargs) 2022-11-23T02:58:20.2080082Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2080194Z self.run_subtests( 2022-11-23T02:58:20.2080549Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2080832Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2081205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2081358Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2081724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2081847Z output = model(*input) 2022-11-23T02:58:20.2082178Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2082321Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2082700Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T02:58:20.2082876Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2083249Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2083372Z _lazy_init(state, module) 2022-11-23T02:58:20.2083716Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2083859Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2084245Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2084379Z return func(*args, **kwargs) 2022-11-23T02:58:20.2084770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2084878Z p_assert( 2022-11-23T02:58:20.2085219Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2085349Z traceback.print_stack() 2022-11-23T02:58:20.2085461Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2085675Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2085868Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2086075Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2086229Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2086449Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2086554Z self.run() 2022-11-23T02:58:20.2086742Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2086892Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2087242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2087377Z
self.run_test(test_name, pipe) 2022-11-23T02:58:20.2087747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2087875Z getattr(self, test_name)() 2022-11-23T02:58:20.2088239Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2088336Z fn() 2022-11-23T02:58:20.2088697Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2088822Z test(self, **param_kwargs) 2022-11-23T02:58:20.2089185Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2089312Z return func(*args, **kwargs) 2022-11-23T02:58:20.2089609Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2089722Z self.run_subtests( 2022-11-23T02:58:20.2090078Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2090249Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2090600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2090755Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2091141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2091262Z output = model(*input) 2022-11-23T02:58:20.2091595Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2091738Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2092117Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2092294Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2092649Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2092773Z _lazy_init(state, module) 2022-11-23T02:58:20.2093129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2093321Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2093677Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2093806Z return func(*args, **kwargs) 2022-11-23T02:58:20.2094191Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2094296Z p_assert( 2022-11-23T02:58:20.2094619Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2094747Z traceback.print_stack() 2022-11-23T02:58:20.2094878Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2095153Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2095299Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2095505Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2095660Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2095879Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2095965Z self.run() 2022-11-23T02:58:20.2096170Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2096318Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2096667Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2096801Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2097170Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2097300Z getattr(self, test_name)() 2022-11-23T02:58:20.2097664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2097743Z fn() 2022-11-23T02:58:20.2098115Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2098237Z test(self, **param_kwargs) 2022-11-23T02:58:20.2098599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2098723Z return func(*args, **kwargs) 2022-11-23T02:58:20.2099024Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2099137Z self.run_subtests( 2022-11-23T02:58:20.2099480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2099649Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2100018Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2100175Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2100556Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2100676Z output = model(*input) 2022-11-23T02:58:20.2101004Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2101148Z return 
forward_call(*input, **kwargs) 2022-11-23T02:58:20.2101528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2101690Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2102070Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2102191Z _lazy_init(state, module) 2022-11-23T02:58:20.2102595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2102745Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2103094Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2103221Z return func(*args, **kwargs) 2022-11-23T02:58:20.2103606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2103691Z p_assert( 2022-11-23T02:58:20.2104034Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2104162Z traceback.print_stack() 2022-11-23T02:58:20.2104341Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2104558Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2104703Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2104914Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2105048Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2105267Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2105370Z self.run() 2022-11-23T02:58:20.2105574Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2105722Z self._target(*self._args, **self._kwargs)
2022-11-23T02:58:20.2106068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2106201Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2106569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2106680Z getattr(self, test_name)() 2022-11-23T02:58:20.2107048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2107147Z fn() 2022-11-23T02:58:20.2107522Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2107646Z test(self, **param_kwargs) 2022-11-23T02:58:20.2108010Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2108134Z return func(*args, **kwargs) 2022-11-23T02:58:20.2108432Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2108529Z self.run_subtests( 2022-11-23T02:58:20.2108886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2109224Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2109611Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2109772Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2110157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2110276Z output = model(*input) 2022-11-23T02:58:20.2110607Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:58:20.2110733Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2111112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2111287Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2111669Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2111790Z _lazy_init(state, module) 2022-11-23T02:58:20.2112213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2112368Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2112716Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2112826Z return func(*args, **kwargs) 2022-11-23T02:58:20.2113216Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2113318Z p_assert( 2022-11-23T02:58:20.2113658Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2113849Z traceback.print_stack() 2022-11-23T02:58:20.2113981Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2114195Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2114341Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2114532Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2114684Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2114901Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2115004Z self.run() 2022-11-23T02:58:20.2115210Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2115356Z
self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2115708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2115824Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2116197Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2116318Z getattr(self, test_name)() 2022-11-23T02:58:20.2116685Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2116783Z fn() 2022-11-23T02:58:20.2117151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2117271Z test(self, **param_kwargs) 2022-11-23T02:58:20.2117632Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2117741Z return func(*args, **kwargs) 2022-11-23T02:58:20.2118039Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2118152Z self.run_subtests( 2022-11-23T02:58:20.2118516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2118679Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2119052Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2119205Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2119584Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2119687Z output = model(*input) 2022-11-23T02:58:20.2120011Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T02:58:20.2120150Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2120530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2120711Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2121083Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2121203Z _lazy_init(state, module) 2022-11-23T02:58:20.2121604Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2121736Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2122086Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2122211Z return func(*args, **kwargs) 2022-11-23T02:58:20.2122593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2122697Z p_assert( 2022-11-23T02:58:20.2123035Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2123206Z traceback.print_stack() 2022-11-23T02:58:20.2123336Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2123532Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2123673Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2123880Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2124033Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2124245Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2124345Z self.run() 2022-11-23T02:58:20.2124551Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run
2022-11-23T02:58:20.2124697Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2125028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2125168Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2125534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2125659Z getattr(self, test_name)() 2022-11-23T02:58:20.2126027Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2126124Z fn() 2022-11-23T02:58:20.2126492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2126598Z test(self, **param_kwargs) 2022-11-23T02:58:20.2126958Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2127081Z return func(*args, **kwargs) 2022-11-23T02:58:20.2127382Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2127499Z self.run_subtests( 2022-11-23T02:58:20.2127854Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2128013Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2128384Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2128535Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2128902Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2129021Z output = model(*input) 2022-11-23T02:58:20.2129352Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2129494Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2129876Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2130059Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2130429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2130594Z _lazy_init(state, module) 2022-11-23T02:58:20.2130946Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2131085Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2131426Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2131552Z return func(*args, **kwargs) 2022-11-23T02:58:20.2131936Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2132040Z p_assert( 2022-11-23T02:58:20.2132382Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2132559Z traceback.print_stack() 2022-11-23T02:58:20.2132671Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2132885Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2133030Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2133233Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2133384Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2133600Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2133701Z self.run() 2022-11-23T02:58:20.2133889Z File
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2134033Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2134381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2134520Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2134886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2135010Z getattr(self, test_name)() 2022-11-23T02:58:20.2135379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2135481Z fn() 2022-11-23T02:58:20.2135832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2135956Z test(self, **param_kwargs) 2022-11-23T02:58:20.2136314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2136437Z return func(*args, **kwargs) 2022-11-23T02:58:20.2136734Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2136852Z self.run_subtests( 2022-11-23T02:58:20.2137205Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2137366Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2137721Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2137874Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2138255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2138375Z output = model(*input) 
2022-11-23T02:58:20.2138702Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2138843Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2139223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2139403Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2139802Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2139931Z _lazy_init(state, module) 2022-11-23T02:58:20.2140294Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2140438Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2140784Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2140910Z return func(*args, **kwargs) 2022-11-23T02:58:20.2141292Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2141393Z p_assert( 2022-11-23T02:58:20.2141785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2141912Z traceback.print_stack() 2022-11-23T02:58:20.2142041Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2142256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2142400Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2142601Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2142752Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2142967Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2143054Z self.run()
2022-11-23T02:58:20.2143253Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2143401Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2143746Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2143885Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2144252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2144378Z getattr(self, test_name)() 2022-11-23T02:58:20.2144727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2144824Z fn() 2022-11-23T02:58:20.2145190Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2145313Z test(self, **param_kwargs) 2022-11-23T02:58:20.2145672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2145798Z return func(*args, **kwargs) 2022-11-23T02:58:20.2146095Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2146213Z self.run_subtests( 2022-11-23T02:58:20.2146551Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2146721Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2147090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2147239Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2147620Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:58:20.2147737Z output = model(*input) 2022-11-23T02:58:20.2148066Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2148207Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2148578Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2148759Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2149392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2149526Z _lazy_init(state, module) 2022-11-23T02:58:20.2149887Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2150030Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2150407Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2150537Z return func(*args, **kwargs) 2022-11-23T02:58:20.2150903Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2151073Z p_assert( 2022-11-23T02:58:20.2151416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2151547Z traceback.print_stack() 2022-11-23T02:58:20.2151659Z dist init r=0, world=2 2022-11-23T02:58:20.2152000Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2152330Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.2152646Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2152960Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2153275Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2153571Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2153880Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2154187Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2154494Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2154800Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2155115Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2155424Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.2155535Z dist init r=1, world=2 2022-11-23T02:58:20.2155864Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2156183Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2156494Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2156858Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2157185Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2157502Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2157864Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2158174Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2158525Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2158834Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.2159146Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2159452Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2159553Z ok (5.813s) 2022-11-23T02:58:20.2159946Z test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92174 2022-11-23T02:58:20.2160159Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92175 2022-11-23T02:58:20.2160553Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2160734Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2161123Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2161321Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2161700Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2161876Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2162259Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2162455Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2162689Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.2162946Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 
2022-11-23T02:58:20.2163354Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2163758Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2163990Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.2164222Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.2165302Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2165426Z warnings.warn( 2022-11-23T02:58:20.2166459Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.2166570Z warnings.warn( 2022-11-23T02:58:20.2166748Z File "", line 1, in 2022-11-23T02:58:20.2166950Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2167093Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2167301Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2167456Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2167674Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2167780Z self.run() 2022-11-23T02:58:20.2167983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2168113Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2168462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2168597Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2168967Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2169097Z getattr(self, test_name)() 2022-11-23T02:58:20.2169466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2169567Z fn() 2022-11-23T02:58:20.2169940Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2170047Z test(self, **param_kwargs) 2022-11-23T02:58:20.2170410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2170537Z return func(*args, **kwargs) 2022-11-23T02:58:20.2170838Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2170952Z self.run_subtests( 
2022-11-23T02:58:20.2171312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2171479Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2171849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2171986Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2172369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2172491Z output = model(*input) 2022-11-23T02:58:20.2172820Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2172962Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2173345Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2173521Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2173901Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2174021Z _lazy_init(state, module) 2022-11-23T02:58:20.2174407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2174560Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2174904Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2175032Z return func(*args, **kwargs) 2022-11-23T02:58:20.2175419Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2175522Z p_assert( 2022-11-23T02:58:20.2175864Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 
116, in p_assert 2022-11-23T02:58:20.2176021Z traceback.print_stack() 2022-11-23T02:58:20.2176151Z File "", line 1, in 2022-11-23T02:58:20.2176362Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2176507Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2176715Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2176864Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2177079Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2177181Z self.run() 2022-11-23T02:58:20.2177368Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2177518Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2177865Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2177997Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2178369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2178494Z getattr(self, test_name)() 2022-11-23T02:58:20.2178859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2178959Z fn() 2022-11-23T02:58:20.2179312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2179439Z test(self, **param_kwargs) 2022-11-23T02:58:20.2179798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2179923Z return func(*args, **kwargs) 2022-11-23T02:58:20.2180225Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2180337Z 
self.run_subtests( 2022-11-23T02:58:20.2180698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2180860Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2181219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2181372Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2181757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2181877Z output = model(*input) 2022-11-23T02:58:20.2182206Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2182349Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2182727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2182909Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2183264Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2183386Z _lazy_init(state, module) 2022-11-23T02:58:20.2183788Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2183939Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2184287Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2184413Z return func(*args, **kwargs) 2022-11-23T02:58:20.2184796Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2184899Z p_assert( 2022-11-23T02:58:20.2185223Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2185396Z traceback.print_stack() 2022-11-23T02:58:20.2185527Z File "", line 1, in 2022-11-23T02:58:20.2185741Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2185890Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2186096Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2186246Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2186445Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2186548Z self.run() 2022-11-23T02:58:20.2186753Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2186900Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2187244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2187384Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2187749Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2187873Z getattr(self, test_name)() 2022-11-23T02:58:20.2188224Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2188324Z fn() 2022-11-23T02:58:20.2188692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2188813Z test(self, **param_kwargs) 2022-11-23T02:58:20.2189361Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2189491Z return func(*args, **kwargs) 2022-11-23T02:58:20.2189796Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in 
test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2189918Z self.run_subtests( 2022-11-23T02:58:20.2190262Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2190425Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2190798Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2190953Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2191336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2191459Z output = model(*input) 2022-11-23T02:58:20.2191786Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2191930Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2192297Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2192480Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2192850Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2193037Z _lazy_init(state, module) 2022-11-23T02:58:20.2193412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2193558Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2193901Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2194026Z return func(*args, **kwargs) 2022-11-23T02:58:20.2194395Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2194500Z p_assert( 
2022-11-23T02:58:20.2194841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2195026Z traceback.print_stack() 2022-11-23T02:58:20.2195158Z File "", line 1, in 2022-11-23T02:58:20.2195373Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2195522Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2195727Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2195862Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2196078Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2196181Z self.run() 2022-11-23T02:58:20.2196388Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2196536Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2196892Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2197031Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2197382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2197506Z getattr(self, test_name)() 2022-11-23T02:58:20.2197872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2197969Z fn() 2022-11-23T02:58:20.2198339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2198462Z test(self, **param_kwargs) 2022-11-23T02:58:20.2198825Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2198949Z return func(*args, **kwargs) 2022-11-23T02:58:20.2199233Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2199352Z self.run_subtests( 2022-11-23T02:58:20.2199706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2199872Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2200244Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2200398Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2200783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2200905Z output = model(*input) 2022-11-23T02:58:20.2201216Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2201358Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2201738Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2201921Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2202338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2202465Z _lazy_init(state, module) 2022-11-23T02:58:20.2202826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2202973Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2203318Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2203425Z return func(*args, **kwargs) 2022-11-23T02:58:20.2203809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 
711, in init_flat_param_attributes 2022-11-23T02:58:20.2203911Z p_assert( 2022-11-23T02:58:20.2204322Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2204452Z traceback.print_stack() 2022-11-23T02:58:20.2204586Z File "", line 1, in 2022-11-23T02:58:20.2204801Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2204927Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2205132Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2205281Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2205500Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2205602Z self.run() 2022-11-23T02:58:20.2205808Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2205956Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2206302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2206422Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2206788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2206915Z getattr(self, test_name)() 2022-11-23T02:58:20.2207279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2207375Z fn() 2022-11-23T02:58:20.2207744Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2207868Z test(self, **param_kwargs) 2022-11-23T02:58:20.2208227Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2208336Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2208636Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2208751Z self.run_subtests( 2022-11-23T02:58:20.2209106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2209273Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2209640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2209798Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2210180Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2210282Z output = model(*input) 2022-11-23T02:58:20.2210615Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2210756Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2211143Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2211320Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2211739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2211868Z _lazy_init(state, module) 2022-11-23T02:58:20.2212229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2212356Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2212697Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2212821Z return func(*args, **kwargs) 2022-11-23T02:58:20.2213204Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2213350Z p_assert( 2022-11-23T02:58:20.2213696Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2213824Z traceback.print_stack() 2022-11-23T02:58:20.2213952Z File "", line 1, in 2022-11-23T02:58:20.2214150Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2214294Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2214499Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2214649Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2214865Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2214968Z self.run() 2022-11-23T02:58:20.2215173Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2215303Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2215652Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2215784Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2216151Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2216276Z getattr(self, test_name)() 2022-11-23T02:58:20.2216639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2216736Z fn() 2022-11-23T02:58:20.2217103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2217210Z test(self, **param_kwargs) 2022-11-23T02:58:20.2217571Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in 
wrapper 2022-11-23T02:58:20.2217695Z return func(*args, **kwargs) 2022-11-23T02:58:20.2218001Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2218112Z self.run_subtests( 2022-11-23T02:58:20.2218470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2218634Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2219006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2219143Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2219524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2219643Z output = model(*input) 2022-11-23T02:58:20.2219974Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2220120Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2220499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2220675Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2221092Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2221200Z _lazy_init(state, module) 2022-11-23T02:58:20.2221562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2221705Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2222047Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2222175Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2222562Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2222711Z p_assert( 2022-11-23T02:58:20.2223057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2223168Z traceback.print_stack() 2022-11-23T02:58:20.2223301Z File "", line 1, in 2022-11-23T02:58:20.2223515Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2223661Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2223865Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2224016Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2224231Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2224334Z self.run() 2022-11-23T02:58:20.2224522Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2224671Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2225015Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2225147Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2225520Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2225645Z getattr(self, test_name)() 2022-11-23T02:58:20.2226006Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2226086Z fn() 2022-11-23T02:58:20.2226455Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2226576Z test(self, **param_kwargs) 2022-11-23T02:58:20.2226940Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2227069Z return func(*args, **kwargs) 2022-11-23T02:58:20.2227370Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2227483Z self.run_subtests( 2022-11-23T02:58:20.2227841Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2227988Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2228357Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2228508Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2228885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2229233Z output = model(*input) 2022-11-23T02:58:20.2229587Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2229736Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2230119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2230366Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2230735Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2230858Z _lazy_init(state, module) 2022-11-23T02:58:20.2231213Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2231357Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2231702Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", 
line 27, in decorate_context 2022-11-23T02:58:20.2231830Z return func(*args, **kwargs) 2022-11-23T02:58:20.2232214Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2232380Z p_assert( 2022-11-23T02:58:20.2232706Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2232840Z traceback.print_stack() 2022-11-23T02:58:20.2232971Z File "", line 1, in 2022-11-23T02:58:20.2233184Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2233324Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2233529Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2233679Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2233877Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2233979Z self.run() 2022-11-23T02:58:20.2234182Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2234334Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2234678Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2234809Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2235176Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2235303Z getattr(self, test_name)() 2022-11-23T02:58:20.2235648Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2235744Z fn() 2022-11-23T02:58:20.2236114Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2236234Z test(self, **param_kwargs) 
2022-11-23T02:58:20.2236596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2236726Z return func(*args, **kwargs) 2022-11-23T02:58:20.2237023Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2237136Z self.run_subtests( 2022-11-23T02:58:20.2237480Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2237647Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2238013Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2238166Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2238546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2238665Z output = model(*input) 2022-11-23T02:58:20.2238992Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2239135Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2239504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2239745Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2240129Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2240254Z _lazy_init(state, module) 2022-11-23T02:58:20.2240614Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2240757Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2241100Z File 
"/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2241224Z return func(*args, **kwargs) 2022-11-23T02:58:20.2241651Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2241755Z p_assert( 2022-11-23T02:58:20.2242098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2242223Z traceback.print_stack() 2022-11-23T02:58:20.2242350Z File "", line 1, in 2022-11-23T02:58:20.2242563Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2242705Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2242905Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2243044Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2243257Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2243359Z self.run() 2022-11-23T02:58:20.2243563Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2243714Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2244059Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2244193Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2244543Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2244670Z getattr(self, test_name)() 2022-11-23T02:58:20.2245037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2245136Z fn() 2022-11-23T02:58:20.2245503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 
2022-11-23T02:58:20.2245626Z test(self, **param_kwargs) 2022-11-23T02:58:20.2245983Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2246113Z return func(*args, **kwargs) 2022-11-23T02:58:20.2246394Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2246511Z self.run_subtests( 2022-11-23T02:58:20.2246872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2247036Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2247399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2247552Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2247930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2248049Z output = model(*input) 2022-11-23T02:58:20.2248367Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2248510Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2248933Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2249116Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2249494Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2249615Z _lazy_init(state, module) 2022-11-23T02:58:20.2249970Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2250114Z handle.init_flat_param_attributes() 
2022-11-23T02:58:20.2250478Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2250661Z return func(*args, **kwargs) 2022-11-23T02:58:20.2251051Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2251155Z p_assert( 2022-11-23T02:58:20.2251501Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2251628Z traceback.print_stack() 2022-11-23T02:58:20.2251757Z File "", line 1, in 2022-11-23T02:58:20.2251966Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2252091Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2252297Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2252448Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2252664Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2252770Z self.run() 2022-11-23T02:58:20.2252972Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2253118Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2253469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2253586Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2253951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2254073Z getattr(self, test_name)() 2022-11-23T02:58:20.2254435Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2254533Z fn() 2022-11-23T02:58:20.2254900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 247, in instantiated_test 2022-11-23T02:58:20.2255024Z test(self, **param_kwargs) 2022-11-23T02:58:20.2255371Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2255498Z return func(*args, **kwargs) 2022-11-23T02:58:20.2255802Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2255916Z self.run_subtests( 2022-11-23T02:58:20.2256272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2256433Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2256799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2256950Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2257328Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2257435Z output = model(*input) 2022-11-23T02:58:20.2257765Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2257906Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2258338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2258524Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2258899Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2259022Z _lazy_init(state, module) 2022-11-23T02:58:20.2259379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2259506Z 
handle.init_flat_param_attributes() 2022-11-23T02:58:20.2259851Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2260027Z return func(*args, **kwargs) 2022-11-23T02:58:20.2260416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2260523Z p_assert( 2022-11-23T02:58:20.2260870Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2260996Z traceback.print_stack() 2022-11-23T02:58:20.2261109Z File "", line 1, in 2022-11-23T02:58:20.2261321Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2261463Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2261665Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2261816Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2262030Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2262135Z self.run() 2022-11-23T02:58:20.2262343Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2262473Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2262823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2262957Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2263325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2263450Z getattr(self, test_name)() 2022-11-23T02:58:20.2263813Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2263909Z fn() 2022-11-23T02:58:20.2264280Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2264389Z test(self, **param_kwargs) 2022-11-23T02:58:20.2264750Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2264875Z return func(*args, **kwargs) 2022-11-23T02:58:20.2265182Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2265296Z self.run_subtests( 2022-11-23T02:58:20.2265653Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2265814Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2266184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2266321Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2266703Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2266828Z output = model(*input) 2022-11-23T02:58:20.2267163Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2267306Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2267730Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2267918Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2268295Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2268399Z _lazy_init(state, module) 2022-11-23T02:58:20.2268758Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2268902Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2269487Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2269704Z return func(*args, **kwargs) 2022-11-23T02:58:20.2270095Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2270202Z p_assert( 2022-11-23T02:58:20.2270548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2270658Z traceback.print_stack() 2022-11-23T02:58:20.2270788Z File "", line 1, in 2022-11-23T02:58:20.2271000Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2271143Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2271348Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2271502Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2271718Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2271809Z self.run() 2022-11-23T02:58:20.2272016Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2272162Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2272511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2272646Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2273012Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2273139Z getattr(self, test_name)() 2022-11-23T02:58:20.2273503Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 534, in wrapper 2022-11-23T02:58:20.2273583Z fn() 2022-11-23T02:58:20.2273949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2274077Z test(self, **param_kwargs) 2022-11-23T02:58:20.2274438Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2274566Z return func(*args, **kwargs) 2022-11-23T02:58:20.2274867Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2274981Z self.run_subtests( 2022-11-23T02:58:20.2275335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2275482Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2275853Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2276006Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2276390Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2276514Z output = model(*input) 2022-11-23T02:58:20.2276845Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2277046Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2277444Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2277605Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2277976Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2278095Z _lazy_init(state, module) 
2022-11-23T02:58:20.2278451Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2278593Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2278992Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2279119Z return func(*args, **kwargs) 2022-11-23T02:58:20.2279504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2279590Z p_assert( 2022-11-23T02:58:20.2279937Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2280063Z traceback.print_stack() 2022-11-23T02:58:20.2280193Z File "", line 1, in 2022-11-23T02:58:20.2280404Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2280545Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2280746Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2280895Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2281097Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2281200Z self.run() 2022-11-23T02:58:20.2281404Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2281551Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2281900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2282037Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2282404Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2282526Z getattr(self, test_name)() 2022-11-23T02:58:20.2282875Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2282974Z fn() 2022-11-23T02:58:20.2283343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2283470Z test(self, **param_kwargs) 2022-11-23T02:58:20.2283827Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2283955Z return func(*args, **kwargs) 2022-11-23T02:58:20.2284255Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2284367Z self.run_subtests( 2022-11-23T02:58:20.2284710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2284872Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2285238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2285390Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2285775Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2285892Z output = model(*input) 2022-11-23T02:58:20.2286266Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2286419Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2286790Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2286972Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2287346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2287464Z _lazy_init(state, module) 2022-11-23T02:58:20.2287820Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2288012Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2288356Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2288482Z return func(*args, **kwargs) 2022-11-23T02:58:20.2288852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2288956Z p_assert( 2022-11-23T02:58:20.2289296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2289424Z traceback.print_stack() 2022-11-23T02:58:20.2289553Z File "", line 1, in 2022-11-23T02:58:20.2289764Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2289896Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2290083Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2290245Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2290461Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2290563Z self.run() 2022-11-23T02:58:20.2290767Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2290918Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2291263Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2291397Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2291745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2291869Z getattr(self, test_name)() 
2022-11-23T02:58:20.2292233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2292330Z fn() 2022-11-23T02:58:20.2292702Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2292826Z test(self, **param_kwargs) 2022-11-23T02:58:20.2293186Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2293298Z return func(*args, **kwargs) 2022-11-23T02:58:20.2293600Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2293713Z self.run_subtests( 2022-11-23T02:58:20.2294068Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2294230Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2294599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2294758Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2295141Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2295262Z output = model(*input) 2022-11-23T02:58:20.2295621Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2295770Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2296157Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2296336Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2296706Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2296829Z _lazy_init(state, module) 2022-11-23T02:58:20.2297184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2297375Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2297705Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2297833Z return func(*args, **kwargs) 2022-11-23T02:58:20.2298217Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2298322Z p_assert( 2022-11-23T02:58:20.2298665Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2298793Z traceback.print_stack() 2022-11-23T02:58:20.2298922Z File "", line 1, in 2022-11-23T02:58:20.2299116Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2299259Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2299461Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2299618Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2299832Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2299937Z self.run() 2022-11-23T02:58:20.2300145Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2300292Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2300619Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2300753Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2301120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 656, in run_test 2022-11-23T02:58:20.2301241Z getattr(self, test_name)() 2022-11-23T02:58:20.2301603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2301702Z fn() 2022-11-23T02:58:20.2302072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2302195Z test(self, **param_kwargs) 2022-11-23T02:58:20.2302542Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2302669Z return func(*args, **kwargs) 2022-11-23T02:58:20.2302968Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2303081Z self.run_subtests( 2022-11-23T02:58:20.2303433Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2303597Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2303964Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2304120Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2304486Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2304604Z output = model(*input) 2022-11-23T02:58:20.2304980Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2305128Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2305510Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2305684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 
2022-11-23T02:58:20.2306054Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2306176Z _lazy_init(state, module) 2022-11-23T02:58:20.2306513Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2306704Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2307052Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2307180Z return func(*args, **kwargs) 2022-11-23T02:58:20.2307564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2307667Z p_assert( 2022-11-23T02:58:20.2308009Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2308139Z traceback.print_stack() 2022-11-23T02:58:20.2308251Z File "", line 1, in 2022-11-23T02:58:20.2308464Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2308606Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2308815Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2309195Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2309427Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2309531Z self.run() 2022-11-23T02:58:20.2309725Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2309873Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2310228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2310363Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2310731Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2310855Z getattr(self, test_name)() 2022-11-23T02:58:20.2311218Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2311319Z fn() 2022-11-23T02:58:20.2311671Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2311793Z test(self, **param_kwargs) 2022-11-23T02:58:20.2312156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2312281Z return func(*args, **kwargs) 2022-11-23T02:58:20.2312584Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2312699Z self.run_subtests( 2022-11-23T02:58:20.2313058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2313222Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2313576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2313735Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2314116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2314306Z output = model(*input) 2022-11-23T02:58:20.2314651Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2314793Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2315175Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 
2022-11-23T02:58:20.2315353Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2315710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2315832Z _lazy_init(state, module) 2022-11-23T02:58:20.2316254Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2316397Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2316742Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2316871Z return func(*args, **kwargs) 2022-11-23T02:58:20.2317260Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2317363Z p_assert( 2022-11-23T02:58:20.2317689Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2317817Z traceback.print_stack() 2022-11-23T02:58:20.2317949Z File "", line 1, in 2022-11-23T02:58:20.2318161Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2318302Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2318511Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2318663Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2318876Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2318965Z self.run() 2022-11-23T02:58:20.2319169Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2319317Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2319663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2319797Z self.run_test(test_name, 
pipe) 2022-11-23T02:58:20.2320164Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2320288Z getattr(self, test_name)() 2022-11-23T02:58:20.2320651Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2320737Z fn() 2022-11-23T02:58:20.2321106Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2321232Z test(self, **param_kwargs) 2022-11-23T02:58:20.2321597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2321722Z return func(*args, **kwargs) 2022-11-23T02:58:20.2322023Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2322139Z self.run_subtests( 2022-11-23T02:58:20.2322479Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2322643Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2323021Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2323175Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2323602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2323730Z output = model(*input) 2022-11-23T02:58:20.2324064Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2324207Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2324590Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 
685, in forward 2022-11-23T02:58:20.2324750Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2325126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2325308Z _lazy_init(state, module) 2022-11-23T02:58:20.2325670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2325816Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2326161Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2326289Z return func(*args, **kwargs) 2022-11-23T02:58:20.2326674Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2326759Z p_assert( 2022-11-23T02:58:20.2327098Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2327224Z traceback.print_stack() 2022-11-23T02:58:20.2327351Z File "", line 1, in 2022-11-23T02:58:20.2327563Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2327711Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2327916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2328050Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2328265Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2328367Z self.run() 2022-11-23T02:58:20.2328568Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2328716Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2329064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2329198Z 
self.run_test(test_name, pipe) 2022-11-23T02:58:20.2329563Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2329669Z getattr(self, test_name)() 2022-11-23T02:58:20.2330041Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2330138Z fn() 2022-11-23T02:58:20.2330508Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2330634Z test(self, **param_kwargs) 2022-11-23T02:58:20.2330996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2331121Z return func(*args, **kwargs) 2022-11-23T02:58:20.2331422Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2331519Z self.run_subtests( 2022-11-23T02:58:20.2331875Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2332038Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2332409Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2332561Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2332989Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2333115Z output = model(*input) 2022-11-23T02:58:20.2333450Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2333574Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2333959Z File 
"/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2334136Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2334505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2334691Z _lazy_init(state, module) 2022-11-23T02:58:20.2335052Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2335195Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2335541Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2335650Z return func(*args, **kwargs) 2022-11-23T02:58:20.2336037Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2336139Z p_assert( 2022-11-23T02:58:20.2336479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2336606Z traceback.print_stack() 2022-11-23T02:58:20.2336736Z File "", line 1, in 2022-11-23T02:58:20.2336949Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2337099Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2337285Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2337437Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2337655Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2337759Z self.run() 2022-11-23T02:58:20.2337963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2338109Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2338453Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2338570Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2338936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2339064Z getattr(self, test_name)() 2022-11-23T02:58:20.2339429Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2339527Z fn() 2022-11-23T02:58:20.2339898Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2340021Z test(self, **param_kwargs) 2022-11-23T02:58:20.2340382Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2340491Z return func(*args, **kwargs) 2022-11-23T02:58:20.2340792Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2340906Z self.run_subtests( 2022-11-23T02:58:20.2341260Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2341424Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2341795Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2341952Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2342379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2342488Z output = model(*input) 2022-11-23T02:58:20.2342821Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2342960Z return 
forward_call(*input, **kwargs) 2022-11-23T02:58:20.2343341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2343519Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2343894Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2344068Z _lazy_init(state, module) 2022-11-23T02:58:20.2344428Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2344558Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2344900Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2345026Z return func(*args, **kwargs) 2022-11-23T02:58:20.2345414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2345518Z p_assert( 2022-11-23T02:58:20.2345860Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2345987Z traceback.print_stack() 2022-11-23T02:58:20.2346118Z File "", line 1, in 2022-11-23T02:58:20.2346317Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2346460Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2346664Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2346818Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2347037Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2347141Z self.run() 2022-11-23T02:58:20.2347345Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2347492Z self._target(*self._args, **self._kwargs) 
2022-11-23T02:58:20.2347820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2347956Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2348321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2348449Z getattr(self, test_name)() 2022-11-23T02:58:20.2348811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2348911Z fn() 2022-11-23T02:58:20.2349475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2349583Z test(self, **param_kwargs) 2022-11-23T02:58:20.2349950Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2350076Z return func(*args, **kwargs) 2022-11-23T02:58:20.2350378Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2350523Z self.run_subtests( 2022-11-23T02:58:20.2350884Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2351054Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2351424Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2351645Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2352023Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2352142Z output = model(*input) 2022-11-23T02:58:20.2352473Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 
2022-11-23T02:58:20.2352616Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2353001Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2353179Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2353624Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2353744Z _lazy_init(state, module) 2022-11-23T02:58:20.2354086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2354234Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2354575Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2354701Z return func(*args, **kwargs) 2022-11-23T02:58:20.2355086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2355190Z p_assert( 2022-11-23T02:58:20.2355532Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2355660Z traceback.print_stack() 2022-11-23T02:58:20.2355776Z File "", line 1, in 2022-11-23T02:58:20.2355988Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2356131Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2356339Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2356491Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2356704Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2356807Z self.run() 2022-11-23T02:58:20.2356995Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2357140Z 
self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2357487Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2357620Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2357984Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2358110Z getattr(self, test_name)() 2022-11-23T02:58:20.2358478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2358575Z fn() 2022-11-23T02:58:20.2358932Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2359056Z test(self, **param_kwargs) 2022-11-23T02:58:20.2359414Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2359541Z return func(*args, **kwargs) 2022-11-23T02:58:20.2359840Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2359955Z self.run_subtests( 2022-11-23T02:58:20.2360311Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2360477Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2360872Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2361036Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2361416Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2361537Z output = model(*input) 2022-11-23T02:58:20.2361868Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", 
line 1427, in _call_impl 2022-11-23T02:58:20.2362012Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2362392Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2362570Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2362975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2363098Z _lazy_init(state, module) 2022-11-23T02:58:20.2363458Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2363602Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2363943Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2364068Z return func(*args, **kwargs) 2022-11-23T02:58:20.2364448Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2364551Z p_assert( 2022-11-23T02:58:20.2364874Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2365007Z traceback.print_stack() 2022-11-23T02:58:20.2365136Z File "", line 1, in 2022-11-23T02:58:20.2365348Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2365489Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2365695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2365849Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2366066Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2366153Z self.run() 2022-11-23T02:58:20.2366356Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 
2022-11-23T02:58:20.2366503Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2366845Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2366977Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2367347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2367474Z getattr(self, test_name)() 2022-11-23T02:58:20.2367822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2367923Z fn() 2022-11-23T02:58:20.2368293Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2368417Z test(self, **param_kwargs) 2022-11-23T02:58:20.2368778Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2368904Z return func(*args, **kwargs) 2022-11-23T02:58:20.2369204Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2369319Z self.run_subtests( 2022-11-23T02:58:20.2369663Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2369824Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2370243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2370404Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2370787Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2370906Z output = model(*input) 2022-11-23T02:58:20.2371237Z File 
"/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2371380Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2371742Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2371970Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2372346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2372467Z _lazy_init(state, module) 2022-11-23T02:58:20.2372828Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2372972Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2373312Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2373437Z return func(*args, **kwargs) 2022-11-23T02:58:20.2373803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2373904Z p_assert( 2022-11-23T02:58:20.2374243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2374375Z traceback.print_stack() 2022-11-23T02:58:20.2374504Z File "", line 1, in 2022-11-23T02:58:20.2374714Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2374865Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2375065Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2375199Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2375415Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2375517Z self.run() 2022-11-23T02:58:20.2375721Z File 
"/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2375864Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2376209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2376351Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2376716Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2376822Z getattr(self, test_name)() 2022-11-23T02:58:20.2377187Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2377285Z fn() 2022-11-23T02:58:20.2377654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2377776Z test(self, **param_kwargs) 2022-11-23T02:58:20.2378136Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2378262Z return func(*args, **kwargs) 2022-11-23T02:58:20.2378562Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2378662Z self.run_subtests( 2022-11-23T02:58:20.2379017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2379177Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2379596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2379756Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2380142Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2380262Z output = model(*input) 
2022-11-23T02:58:20.2380591Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2380718Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2381099Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2381367Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2381739Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2381860Z _lazy_init(state, module) 2022-11-23T02:58:20.2382218Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2382362Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2382702Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2382811Z return func(*args, **kwargs) 2022-11-23T02:58:20.2383192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2383295Z p_assert( 2022-11-23T02:58:20.2383639Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2383768Z traceback.print_stack() 2022-11-23T02:58:20.2383896Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2384106Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2384234Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2384440Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2384589Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2384803Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2384906Z self.run()
2022-11-23T02:58:20.2385109Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2385254Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2385598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2385718Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2386086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2386207Z getattr(self, test_name)() 2022-11-23T02:58:20.2386570Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2386662Z fn() 2022-11-23T02:58:20.2387033Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2387156Z test(self, **param_kwargs) 2022-11-23T02:58:20.2387514Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2387623Z return func(*args, **kwargs) 2022-11-23T02:58:20.2387923Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 128, in test_nested_wrapped_model_single_iteration_mixed_precision 2022-11-23T02:58:20.2388038Z self.run_subtests( 2022-11-23T02:58:20.2388391Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2388601Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2389214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2389381Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2389763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 
2022-11-23T02:58:20.2389866Z output = model(*input) 2022-11-23T02:58:20.2390194Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2390330Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2390709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2390960Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2391336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2391458Z _lazy_init(state, module) 2022-11-23T02:58:20.2391810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2391937Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2392276Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2392399Z return func(*args, **kwargs) 2022-11-23T02:58:20.2392779Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2392879Z p_assert( 2022-11-23T02:58:20.2393223Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2393348Z traceback.print_stack() 2022-11-23T02:58:20.2393458Z dist init r=1, world=2 2022-11-23T02:58:20.2393784Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2394105Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.2394419Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2394727Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2395042Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2395354Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2395658Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2395958Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2396263Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2396588Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2396904Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2397279Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.2397381Z dist init r=0, world=2 2022-11-23T02:58:20.2397711Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2398028Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2398337Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2398702Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2399009Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2399314Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2399615Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2399920Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2400230Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2400537Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.2400827Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2401131Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2401229Z ok (5.814s) 2022-11-23T02:58:20.2401570Z test_transformer_offload_false_no_shard_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92257 2022-11-23T02:58:20.2401794Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92258 2022-11-23T02:58:20.2402178Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2402353Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2402737Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2402926Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2403283Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2403455Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2403836Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2404032Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2404274Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.2404563Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.2404979Z 
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2405382Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2405613Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.2405827Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.2406063Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2406349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2407381Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2407492Z warnings.warn( 2022-11-23T02:58:20.2408518Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2408630Z warnings.warn( 2022-11-23T02:58:20.2408859Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2409094Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2409324Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2409557Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2409772Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2409999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2410220Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2410442Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2410670Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2410898Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2411126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2411355Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2411567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2411792Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2412017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2412242Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2412471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2412697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2412921Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2413197Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2413435Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2413647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2413873Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2414102Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2414325Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2414603Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2414830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2415060Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2415284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2415497Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2415719Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2415940Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2416164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2416389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2416619Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2416845Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2417074Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2417286Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2417513Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2417738Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2417963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2418191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2418419Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2418650Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2418875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2419101Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2419196Z dist init r=1, world=2 2022-11-23T02:58:20.2419300Z dist init r=0, world=2 2022-11-23T02:58:20.2419398Z ok (8.919s) 2022-11-23T02:58:20.2419744Z test_transformer_offload_false_none_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92340 2022-11-23T02:58:20.2419968Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92341 2022-11-23T02:58:20.2420357Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2420535Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2420921Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2421097Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2421521Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2421706Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2422091Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2422280Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2422524Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.2422772Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.2423226Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2423615Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.2423847Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.2424076Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.2424310Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2424541Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2425564Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2425684Z warnings.warn( 2022-11-23T02:58:20.2426724Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2426836Z warnings.warn( 2022-11-23T02:58:20.2427069Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2427306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2427531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2427764Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2427999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2428230Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2428459Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2428688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2428915Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2429318Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2429554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2429769Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2430068Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2430303Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2430529Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2430758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2430986Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2431210Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2431432Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2431700Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2431928Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2432157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2432384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2432608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2432832Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2433058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2433283Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2433507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2433725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2433945Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2434171Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2434391Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2434612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2434830Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2435051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2435277Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2435487Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2435717Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2435942Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2436168Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2436389Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2436610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2436829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2437050Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2437259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2437485Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2437593Z dist init r=0, world=2 2022-11-23T02:58:20.2437696Z dist init r=1, world=2 2022-11-23T02:58:20.2437794Z ok (8.918s) 2022-11-23T02:58:20.2438194Z test_transformer_offload_false_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92423 2022-11-23T02:58:20.2438420Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92424 2022-11-23T02:58:20.2438807Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2438970Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2439353Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2439543Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2439966Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2440139Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2440520Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2440707Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2440950Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.2441191Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.2441580Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2441979Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.2442214Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.2442440Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.2442673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2442908Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2443936Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2444050Z warnings.warn( 2022-11-23T02:58:20.2445077Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2445188Z warnings.warn( 2022-11-23T02:58:20.2445422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2445642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2445870Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2446099Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2446327Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2446559Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2446829Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2447067Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2447295Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2447509Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2447736Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2447961Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2448187Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2448463Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2448687Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2448916Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2449140Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2449364Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2449575Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2449799Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2450017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2450248Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2450510Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2450733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2450958Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2451182Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2451393Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2451618Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2451842Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2452065Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2452287Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2452508Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2452735Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2452963Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2453173Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2453397Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2453613Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2453836Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2454054Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2454281Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2454505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2454780Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2455013Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2455224Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2455445Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2455673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2455785Z dist init r=1, world=2 2022-11-23T02:58:20.2455886Z dist init r=0, world=2 2022-11-23T02:58:20.2455981Z ok (9.121s) 2022-11-23T02:58:20.2456380Z test_transformer_offload_true_no_shard_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92506 2022-11-23T02:58:20.2456598Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92507 2022-11-23T02:58:20.2456971Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2457144Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2457528Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2457766Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2458135Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2458308Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2458694Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2458883Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2459118Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.2459362Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.2459762Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2460160Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.2460390Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.2460618Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.2460857Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2461090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2462119Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2462234Z warnings.warn( 2022-11-23T02:58:20.2463263Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.2463375Z warnings.warn( 2022-11-23T02:58:20.2463538Z File "", line 1, in 2022-11-23T02:58:20.2463759Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2463901Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2464101Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2464246Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2464457Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2464556Z self.run() 2022-11-23T02:58:20.2464744Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2464934Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2465280Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2465413Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2465782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2465904Z getattr(self, test_name)() 2022-11-23T02:58:20.2466265Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2466358Z fn() 2022-11-23T02:58:20.2466712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2466835Z test(self, **param_kwargs) 2022-11-23T02:58:20.2467195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2467324Z return func(*args, **kwargs) 2022-11-23T02:58:20.2467564Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2467674Z self.run_subtests( 2022-11-23T02:58:20.2468033Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2468197Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2468552Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2468705Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2469314Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2469437Z output = model(*input) 2022-11-23T02:58:20.2469772Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2469920Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2470304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2470480Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2470841Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2470961Z _lazy_init(state, module) 2022-11-23T02:58:20.2471314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2471454Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2471792Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2471911Z return func(*args, **kwargs) 2022-11-23T02:58:20.2472285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2472393Z p_assert( 2022-11-23T02:58:20.2472718Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2472841Z traceback.print_stack() 2022-11-23T02:58:20.2473040Z File "", line 1, in 2022-11-23T02:58:20.2473256Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2473391Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2473593Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2473743Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2473943Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2474045Z self.run() 2022-11-23T02:58:20.2474245Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2474456Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2474800Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2474931Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2475302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2475419Z getattr(self, test_name)() 2022-11-23T02:58:20.2475773Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2475871Z fn() 2022-11-23T02:58:20.2476232Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2476350Z test(self, **param_kwargs) 2022-11-23T02:58:20.2476707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2476833Z return func(*args, **kwargs) 2022-11-23T02:58:20.2477071Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2477183Z self.run_subtests( 2022-11-23T02:58:20.2477527Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2477689Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2478053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2478201Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2478578Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2478694Z output = model(*input) 2022-11-23T02:58:20.2479019Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2479161Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2479531Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2479705Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2480075Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2480190Z _lazy_init(state, module) 2022-11-23T02:58:20.2480548Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2480688Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2481028Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2481151Z return func(*args, **kwargs) 2022-11-23T02:58:20.2481518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2481625Z p_assert( 2022-11-23T02:58:20.2481962Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2482083Z traceback.print_stack() 2022-11-23T02:58:20.2482370Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2482612Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2482740Z File "", line 1, in 2022-11-23T02:58:20.2482946Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2483072Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2483273Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2483424Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2483636Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2483783Z self.run() 2022-11-23T02:58:20.2483983Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2484131Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2484470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2484604Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2484965Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2485085Z getattr(self, test_name)() 2022-11-23T02:58:20.2485442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2485533Z fn() 2022-11-23T02:58:20.2485899Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2486023Z test(self, **param_kwargs) 2022-11-23T02:58:20.2486369Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2486494Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.2486737Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2486845Z self.run_subtests( 2022-11-23T02:58:20.2487199Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2487363Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2487726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2487879Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2488243Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2488359Z output = model(*input) 2022-11-23T02:58:20.2488682Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2488821Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2489198Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2489375Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2489741Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2489856Z _lazy_init(state, module) 2022-11-23T02:58:20.2490196Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2490336Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2490685Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2490808Z return func(*args, **kwargs) 2022-11-23T02:58:20.2491187Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2491333Z p_assert( 2022-11-23T02:58:20.2491683Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2491804Z traceback.print_stack() 2022-11-23T02:58:20.2491917Z File "", line 1, in 2022-11-23T02:58:20.2492125Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2492264Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2492462Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2492607Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2492867Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2492970Z self.run() 2022-11-23T02:58:20.2493170Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2493301Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2493650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2493784Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2494153Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2494277Z getattr(self, test_name)() 2022-11-23T02:58:20.2494640Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2494732Z fn() 2022-11-23T02:58:20.2495084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2495204Z test(self, **param_kwargs) 2022-11-23T02:58:20.2495564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2495686Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2495932Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2496039Z self.run_subtests( 2022-11-23T02:58:20.2496395Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2496554Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2496907Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2497058Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2497439Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2497563Z output = model(*input) 2022-11-23T02:58:20.2497890Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2498028Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2498409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2498585Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2498954Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2499059Z _lazy_init(state, module) 2022-11-23T02:58:20.2499412Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2499552Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2499894Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2500017Z return func(*args, **kwargs) 2022-11-23T02:58:20.2500457Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.2500567Z p_assert( 2022-11-23T02:58:20.2500914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2501023Z traceback.print_stack() 2022-11-23T02:58:20.2501266Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2501501Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2501628Z File "", line 1, in 2022-11-23T02:58:20.2501841Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2501982Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2502235Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2502370Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2502582Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2502687Z self.run() 2022-11-23T02:58:20.2502887Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2503029Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2503373Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2503501Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2503862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2503970Z getattr(self, test_name)() 2022-11-23T02:58:20.2504335Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2504435Z fn() 2022-11-23T02:58:20.2504802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2504924Z test(self, **param_kwargs) 
2022-11-23T02:58:20.2505282Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2505401Z return func(*args, **kwargs) 2022-11-23T02:58:20.2505642Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2505739Z self.run_subtests( 2022-11-23T02:58:20.2506091Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2506254Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2506622Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2506781Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2507162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2507281Z output = model(*input) 2022-11-23T02:58:20.2507605Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2507729Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2508107Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2508285Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2508653Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2508770Z _lazy_init(state, module) 2022-11-23T02:58:20.2509304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2509451Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2509867Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:58:20.2509982Z return func(*args, **kwargs) 2022-11-23T02:58:20.2510374Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2510473Z p_assert( 2022-11-23T02:58:20.2510812Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2510935Z traceback.print_stack() 2022-11-23T02:58:20.2511065Z File "", line 1, in 2022-11-23T02:58:20.2511277Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2511403Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2511667Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2511812Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2512030Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2512129Z self.run() 2022-11-23T02:58:20.2512330Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2512476Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2512823Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2512940Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2513308Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2513425Z getattr(self, test_name)() 2022-11-23T02:58:20.2513785Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2513879Z fn() 2022-11-23T02:58:20.2514247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2514370Z test(self, **param_kwargs) 2022-11-23T02:58:20.2514728Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2514837Z return func(*args, **kwargs) 2022-11-23T02:58:20.2515073Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2515181Z self.run_subtests( 2022-11-23T02:58:20.2515533Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2515691Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2516053Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2516205Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2516582Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2516690Z output = model(*input) 2022-11-23T02:58:20.2517018Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2517154Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2517533Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2517709Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2518079Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2518198Z _lazy_init(state, module) 2022-11-23T02:58:20.2518559Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2518685Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2519106Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.2519234Z return func(*args, **kwargs) 2022-11-23T02:58:20.2519623Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2519725Z p_assert( 2022-11-23T02:58:20.2520066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2520192Z traceback.print_stack() 2022-11-23T02:58:20.2520433Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2520655Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2520848Z File "", line 1, in 2022-11-23T02:58:20.2521062Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2521200Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2521398Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2521544Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2521757Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2521843Z self.run() 2022-11-23T02:58:20.2522043Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2522188Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2522537Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2522669Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2523036Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2523151Z getattr(self, test_name)() 2022-11-23T02:58:20.2523516Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2523597Z fn() 
2022-11-23T02:58:20.2523968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2524087Z test(self, **param_kwargs) 2022-11-23T02:58:20.2524447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2524569Z return func(*args, **kwargs) 2022-11-23T02:58:20.2524802Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2524911Z self.run_subtests( 2022-11-23T02:58:20.2525270Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2525420Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2525794Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2525943Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2526321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2526438Z output = model(*input) 2022-11-23T02:58:20.2526762Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2526900Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2527275Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2527436Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2527813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2527930Z _lazy_init(state, module) 2022-11-23T02:58:20.2528327Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:20.2528476Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2528817Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2528942Z return func(*args, **kwargs) 2022-11-23T02:58:20.2529323Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2529409Z p_assert( 2022-11-23T02:58:20.2529747Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2529924Z traceback.print_stack() 2022-11-23T02:58:20.2530050Z File "", line 1, in 2022-11-23T02:58:20.2530258Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2530401Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2530606Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2530756Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2530956Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2531057Z self.run() 2022-11-23T02:58:20.2531256Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2531399Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2531747Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2531881Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2532245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2532369Z getattr(self, test_name)() 2022-11-23T02:58:20.2532718Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2532811Z fn() 2022-11-23T02:58:20.2533177Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2533299Z test(self, **param_kwargs) 2022-11-23T02:58:20.2533654Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2533779Z return func(*args, **kwargs) 2022-11-23T02:58:20.2534017Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2534113Z self.run_subtests( 2022-11-23T02:58:20.2534466Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2534631Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2535003Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2535152Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2535528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2535645Z output = model(*input) 2022-11-23T02:58:20.2535975Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2536101Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2536482Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2536653Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2537021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2537141Z _lazy_init(state, module) 2022-11-23T02:58:20.2537540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.2537687Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2538030Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2538151Z return func(*args, **kwargs) 2022-11-23T02:58:20.2538518Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2538619Z p_assert( 2022-11-23T02:58:20.2538957Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2539127Z traceback.print_stack() 2022-11-23T02:58:20.2539365Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2539604Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2539732Z File "", line 1, in 2022-11-23T02:58:20.2539929Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2540070Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2540268Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2540413Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2540628Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2540732Z self.run() 2022-11-23T02:58:20.2540934Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2541077Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2541415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2541546Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2541910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2542025Z getattr(self, test_name)() 
2022-11-23T02:58:20.2542385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2542483Z fn() 2022-11-23T02:58:20.2542849Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2542971Z test(self, **param_kwargs) 2022-11-23T02:58:20.2543315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2543440Z return func(*args, **kwargs) 2022-11-23T02:58:20.2543680Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2543791Z self.run_subtests( 2022-11-23T02:58:20.2544149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2544308Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2544670Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2544820Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2545183Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2545297Z output = model(*input) 2022-11-23T02:58:20.2545622Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2545764Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2546144Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2546320Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2546729Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2546857Z _lazy_init(state, module) 2022-11-23T02:58:20.2547204Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2547339Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2547677Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2547799Z return func(*args, **kwargs) 2022-11-23T02:58:20.2548177Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2548329Z p_assert( 2022-11-23T02:58:20.2548672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2548796Z traceback.print_stack() 2022-11-23T02:58:20.2548914Z File "", line 1, in 2022-11-23T02:58:20.2549303Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2549447Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2549647Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2549797Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2550009Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2550109Z self.run() 2022-11-23T02:58:20.2550298Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2550475Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2550822Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2550953Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2551321Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2551438Z getattr(self, test_name)() 
2022-11-23T02:58:20.2551806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2551900Z fn() 2022-11-23T02:58:20.2552252Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2552373Z test(self, **param_kwargs) 2022-11-23T02:58:20.2552730Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2552850Z return func(*args, **kwargs) 2022-11-23T02:58:20.2553089Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2553197Z self.run_subtests( 2022-11-23T02:58:20.2553550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2553712Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2554066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2554217Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2554596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2554710Z output = model(*input) 2022-11-23T02:58:20.2555037Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2555182Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2555567Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2555744Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2556176Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2556301Z _lazy_init(state, module) 2022-11-23T02:58:20.2556662Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2556804Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2557143Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2557263Z return func(*args, **kwargs) 2022-11-23T02:58:20.2557642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2557806Z p_assert( 2022-11-23T02:58:20.2558137Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2558264Z traceback.print_stack() 2022-11-23T02:58:20.2558512Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2558751Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2558877Z File "", line 1, in 2022-11-23T02:58:20.2559082Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2559224Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2559426Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2559561Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2559773Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2559876Z self.run() 2022-11-23T02:58:20.2560076Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2560218Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2560564Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2560698Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2561047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2561169Z getattr(self, test_name)() 2022-11-23T02:58:20.2561529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2561619Z fn() 2022-11-23T02:58:20.2561982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2562105Z test(self, **param_kwargs) 2022-11-23T02:58:20.2562462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2562585Z return func(*args, **kwargs) 2022-11-23T02:58:20.2562814Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2562919Z self.run_subtests( 2022-11-23T02:58:20.2563271Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2563428Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2563788Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2563935Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2564312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2564430Z output = model(*input) 2022-11-23T02:58:20.2564742Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2564881Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2565308Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2565490Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2565859Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2565973Z _lazy_init(state, module) 2022-11-23T02:58:20.2566324Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2566460Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2566783Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2566959Z return func(*args, **kwargs) 2022-11-23T02:58:20.2567341Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2567441Z p_assert( 2022-11-23T02:58:20.2567781Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2567902Z traceback.print_stack() 2022-11-23T02:58:20.2568027Z File "", line 1, in 2022-11-23T02:58:20.2568237Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2568364Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2568564Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2568715Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2568926Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2569031Z self.run() 2022-11-23T02:58:20.2569234Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2569373Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2569704Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2569832Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2570194Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2570314Z getattr(self, test_name)() 2022-11-23T02:58:20.2570673Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2570765Z fn() 2022-11-23T02:58:20.2571129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2571250Z test(self, **param_kwargs) 2022-11-23T02:58:20.2571596Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2571718Z return func(*args, **kwargs) 2022-11-23T02:58:20.2571959Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2572069Z self.run_subtests( 2022-11-23T02:58:20.2572422Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2572579Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2572944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2573098Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2573460Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2573582Z output = model(*input) 2022-11-23T02:58:20.2573909Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2574044Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2574465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2574649Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2575021Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2575140Z _lazy_init(state, module) 2022-11-23T02:58:20.2575479Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2575616Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2575953Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2576123Z return func(*args, **kwargs) 2022-11-23T02:58:20.2576505Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2576606Z p_assert( 2022-11-23T02:58:20.2576945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2577070Z traceback.print_stack() 2022-11-23T02:58:20.2577295Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2577531Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2577654Z File "", line 1, in 2022-11-23T02:58:20.2577862Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2577999Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2578204Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2578354Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2578562Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2578652Z self.run() 2022-11-23T02:58:20.2578853Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2578996Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2579337Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2579466Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2579828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2579947Z getattr(self, test_name)() 2022-11-23T02:58:20.2580306Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2580390Z fn() 2022-11-23T02:58:20.2580757Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2580877Z test(self, **param_kwargs) 2022-11-23T02:58:20.2581238Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2581359Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.2581597Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2581705Z self.run_subtests( 2022-11-23T02:58:20.2582046Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2582203Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2582565Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2582720Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2583096Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2583258Z output = model(*input) 2022-11-23T02:58:20.2583599Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2583737Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2584116Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2584276Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2584645Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2584760Z _lazy_init(state, module) 2022-11-23T02:58:20.2585180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2585322Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2585661Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2585784Z return func(*args, **kwargs) 2022-11-23T02:58:20.2586168Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2586254Z p_assert( 2022-11-23T02:58:20.2586594Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2586717Z traceback.print_stack() 2022-11-23T02:58:20.2586841Z File "", line 1, in 2022-11-23T02:58:20.2587048Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2587187Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2587400Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2587534Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2587747Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2587849Z self.run() 2022-11-23T02:58:20.2588052Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2588197Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2588535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2588662Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2589254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2589371Z getattr(self, test_name)() 2022-11-23T02:58:20.2589743Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2589844Z fn() 2022-11-23T02:58:20.2590217Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2590337Z test(self, **param_kwargs) 2022-11-23T02:58:20.2590701Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2590821Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2591059Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2591155Z self.run_subtests( 2022-11-23T02:58:20.2591510Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2591667Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2592037Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2592188Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2592568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2592751Z output = model(*input) 2022-11-23T02:58:20.2593092Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2593216Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2593597Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2593771Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2594139Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2594258Z _lazy_init(state, module) 2022-11-23T02:58:20.2594672Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2594812Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2595155Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2595265Z return func(*args, **kwargs) 2022-11-23T02:58:20.2595652Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.2595752Z p_assert( 2022-11-23T02:58:20.2596088Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2596213Z traceback.print_stack() 2022-11-23T02:58:20.2596452Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2596688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2596818Z File "", line 1, in 2022-11-23T02:58:20.2597013Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2597151Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2597360Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2597511Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2597723Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2597824Z self.run() 2022-11-23T02:58:20.2598024Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2598153Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2598501Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2598628Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2598992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2599116Z getattr(self, test_name)() 2022-11-23T02:58:20.2599477Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2599574Z fn() 2022-11-23T02:58:20.2599939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2600045Z test(self, **param_kwargs) 
2022-11-23T02:58:20.2600405Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2600528Z return func(*args, **kwargs) 2022-11-23T02:58:20.2600768Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2600876Z self.run_subtests( 2022-11-23T02:58:20.2601231Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2601395Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2601763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2601948Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2602336Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2602457Z output = model(*input) 2022-11-23T02:58:20.2602787Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2602927Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2603303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2603477Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2603906Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2604011Z _lazy_init(state, module) 2022-11-23T02:58:20.2604366Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2604507Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2604843Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:58:20.2604966Z return func(*args, **kwargs) 2022-11-23T02:58:20.2605346Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2605446Z p_assert( 2022-11-23T02:58:20.2605786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2605896Z traceback.print_stack() 2022-11-23T02:58:20.2606027Z File "", line 1, in 2022-11-23T02:58:20.2606234Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2606371Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2606574Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2606724Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2606936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2607034Z self.run() 2022-11-23T02:58:20.2607223Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2607364Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2607706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2607833Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2608200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2608319Z getattr(self, test_name)() 2022-11-23T02:58:20.2608676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2608761Z fn() 2022-11-23T02:58:20.2609129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2609249Z test(self, **param_kwargs) 2022-11-23T02:58:20.2609602Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2609721Z return func(*args, **kwargs) 2022-11-23T02:58:20.2609960Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2610069Z self.run_subtests( 2022-11-23T02:58:20.2610427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2610577Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2610944Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2611143Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2611534Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2611652Z output = model(*input) 2022-11-23T02:58:20.2611977Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2612116Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2612500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2612661Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2613077Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2613199Z _lazy_init(state, module) 2022-11-23T02:58:20.2613555Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2613698Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2614035Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.2614159Z return func(*args, **kwargs) 2022-11-23T02:58:20.2614540Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2614627Z p_assert( 2022-11-23T02:58:20.2614966Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2615095Z traceback.print_stack() 2022-11-23T02:58:20.2615333Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2615567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2615696Z File "", line 1, in 2022-11-23T02:58:20.2615907Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2616046Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2616232Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2616376Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2616591Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2616692Z self.run() 2022-11-23T02:58:20.2616891Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2617034Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2617381Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2617516Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2617870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2617992Z getattr(self, test_name)() 2022-11-23T02:58:20.2618352Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2618444Z fn() 
2022-11-23T02:58:20.2618810Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2618928Z test(self, **param_kwargs) 2022-11-23T02:58:20.2619284Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2619408Z return func(*args, **kwargs) 2022-11-23T02:58:20.2619640Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2619747Z self.run_subtests( 2022-11-23T02:58:20.2620149Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2620312Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2620680Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2620826Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2621198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2621314Z output = model(*input) 2022-11-23T02:58:20.2621628Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2621767Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2622202Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2622376Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2622750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2622869Z _lazy_init(state, module) 2022-11-23T02:58:20.2623220Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:20.2623362Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2623686Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2623809Z return func(*args, **kwargs) 2022-11-23T02:58:20.2624190Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2624297Z p_assert( 2022-11-23T02:58:20.2624637Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2624760Z traceback.print_stack() 2022-11-23T02:58:20.2624891Z File "", line 1, in 2022-11-23T02:58:20.2625087Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2625227Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2625426Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2625575Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2625789Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2625889Z self.run() 2022-11-23T02:58:20.2626087Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2626232Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2626568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2626697Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2627065Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2627184Z getattr(self, test_name)() 2022-11-23T02:58:20.2627545Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2627639Z fn() 2022-11-23T02:58:20.2628006Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2628128Z test(self, **param_kwargs) 2022-11-23T02:58:20.2628475Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2628598Z return func(*args, **kwargs) 2022-11-23T02:58:20.2628838Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2629120Z self.run_subtests( 2022-11-23T02:58:20.2629566Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2629731Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2630098Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2630253Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2630615Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2630728Z output = model(*input) 2022-11-23T02:58:20.2631052Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2631250Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2631631Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2631805Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2632179Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2632296Z _lazy_init(state, module) 2022-11-23T02:58:20.2632636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.2632775Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2633113Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2633239Z return func(*args, **kwargs) 2022-11-23T02:58:20.2633622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2633724Z p_assert( 2022-11-23T02:58:20.2634066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2634186Z traceback.print_stack() 2022-11-23T02:58:20.2634415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2634647Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2634774Z File "", line 1, in 2022-11-23T02:58:20.2634985Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2635128Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2635327Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2635476Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2635672Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2635778Z self.run() 2022-11-23T02:58:20.2635977Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2636120Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2636468Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2636599Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2636962Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2637080Z getattr(self, test_name)() 
2022-11-23T02:58:20.2637427Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2637526Z fn() 2022-11-23T02:58:20.2637889Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2638016Z test(self, **param_kwargs) 2022-11-23T02:58:20.2638377Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2638503Z return func(*args, **kwargs) 2022-11-23T02:58:20.2638789Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2638909Z self.run_subtests( 2022-11-23T02:58:20.2639254Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2639414Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2639779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2639931Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2640307Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2640473Z output = model(*input) 2022-11-23T02:58:20.2640802Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2640941Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2641313Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2641491Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2641863Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2641980Z _lazy_init(state, module) 2022-11-23T02:58:20.2642338Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2642477Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2642818Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2642941Z return func(*args, **kwargs) 2022-11-23T02:58:20.2643310Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2643412Z p_assert( 2022-11-23T02:58:20.2643748Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2643870Z traceback.print_stack() 2022-11-23T02:58:20.2643995Z File "", line 1, in 2022-11-23T02:58:20.2644205Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2644347Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2644551Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2644686Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2644899Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2645005Z self.run() 2022-11-23T02:58:20.2645201Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2645347Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2645696Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2645827Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2646179Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2646300Z getattr(self, test_name)() 
2022-11-23T02:58:20.2646660Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2646753Z fn() 2022-11-23T02:58:20.2647120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2647245Z test(self, **param_kwargs) 2022-11-23T02:58:20.2647602Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2647724Z return func(*args, **kwargs) 2022-11-23T02:58:20.2648002Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2648120Z self.run_subtests( 2022-11-23T02:58:20.2648478Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2648636Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2648998Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2649147Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2649524Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2649699Z output = model(*input) 2022-11-23T02:58:20.2650015Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2650152Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2650580Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2650757Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2651128Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2651248Z _lazy_init(state, module) 2022-11-23T02:58:20.2651595Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2651731Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2652055Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2652181Z return func(*args, **kwargs) 2022-11-23T02:58:20.2652564Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2652665Z p_assert( 2022-11-23T02:58:20.2653004Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2653129Z traceback.print_stack() 2022-11-23T02:58:20.2653366Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2653601Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2653715Z File "", line 1, in 2022-11-23T02:58:20.2653922Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2654058Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2654262Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2654410Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2654619Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2654718Z self.run() 2022-11-23T02:58:20.2654920Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2655049Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2655401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2655531Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2655896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2656018Z getattr(self, test_name)() 2022-11-23T02:58:20.2656379Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2656476Z fn() 2022-11-23T02:58:20.2656832Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2656948Z test(self, **param_kwargs) 2022-11-23T02:58:20.2657358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2657489Z return func(*args, **kwargs) 2022-11-23T02:58:20.2657726Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2657837Z self.run_subtests( 2022-11-23T02:58:20.2658200Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2658357Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2658707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2658950Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2659327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2659445Z output = model(*input) 2022-11-23T02:58:20.2659769Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2659905Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2660286Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2660463Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2660830Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2660937Z _lazy_init(state, module) 2022-11-23T02:58:20.2661293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2661433Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2661772Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2661898Z return func(*args, **kwargs) 2022-11-23T02:58:20.2662280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2662377Z p_assert( 2022-11-23T02:58:20.2662715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2662824Z traceback.print_stack() 2022-11-23T02:58:20.2662951Z File "", line 1, in 2022-11-23T02:58:20.2663157Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2663298Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2663504Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2663654Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2663866Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2663956Z self.run() 2022-11-23T02:58:20.2664156Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2664298Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2664635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2664765Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2665129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2665249Z getattr(self, test_name)() 2022-11-23T02:58:20.2665612Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2665696Z fn() 2022-11-23T02:58:20.2666063Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2666181Z test(self, **param_kwargs) 2022-11-23T02:58:20.2666591Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2666721Z return func(*args, **kwargs) 2022-11-23T02:58:20.2666959Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2667065Z self.run_subtests( 2022-11-23T02:58:20.2667419Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2667565Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2667931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2668132Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2668512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2668632Z output = model(*input) 2022-11-23T02:58:20.2669173Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2669328Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2669714Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2669876Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2670246Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2670363Z _lazy_init(state, module) 2022-11-23T02:58:20.2670722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2670862Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2676894Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2677058Z return func(*args, **kwargs) 2022-11-23T02:58:20.2677486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2677574Z p_assert( 2022-11-23T02:58:20.2677921Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2678045Z traceback.print_stack() 2022-11-23T02:58:20.2678284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2678523Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2678652Z File "", line 1, in 2022-11-23T02:58:20.2678865Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2679002Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2679195Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2679341Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2679554Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2679652Z self.run() 2022-11-23T02:58:20.2679855Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2679996Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2680343Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2680460Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2680828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2680953Z getattr(self, test_name)() 2022-11-23T02:58:20.2681324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2681534Z fn() 2022-11-23T02:58:20.2681926Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2682042Z test(self, **param_kwargs) 2022-11-23T02:58:20.2682401Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2682510Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.2682746Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2682854Z self.run_subtests( 2022-11-23T02:58:20.2683210Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2683510Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2683885Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2684039Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2684415Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2684519Z output = model(*input) 2022-11-23T02:58:20.2684839Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2684972Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2685348Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2685524Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2685902Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2686018Z _lazy_init(state, module) 2022-11-23T02:58:20.2686375Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2686514Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2686841Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2686960Z return func(*args, **kwargs) 2022-11-23T02:58:20.2687340Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2687435Z p_assert( 2022-11-23T02:58:20.2687770Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2687892Z traceback.print_stack() 2022-11-23T02:58:20.2688022Z File "", line 1, in 2022-11-23T02:58:20.2688217Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2688354Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2688550Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2688694Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2688904Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2689001Z self.run() 2022-11-23T02:58:20.2689201Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2689346Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2689675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2689808Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2690168Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2690293Z getattr(self, test_name)() 2022-11-23T02:58:20.2690656Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2690801Z fn() 2022-11-23T02:58:20.2691178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2691297Z test(self, **param_kwargs) 2022-11-23T02:58:20.2691642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2691766Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2692002Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2692108Z self.run_subtests( 2022-11-23T02:58:20.2692461Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2692669Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2693038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2693189Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2693554Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2693667Z output = model(*input) 2022-11-23T02:58:20.2693990Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2694125Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2694503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2694676Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2695048Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2695166Z _lazy_init(state, module) 2022-11-23T02:58:20.2695511Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2695653Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2695991Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2696112Z return func(*args, **kwargs) 2022-11-23T02:58:20.2696486Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.2696579Z p_assert( 2022-11-23T02:58:20.2696914Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2697035Z traceback.print_stack() 2022-11-23T02:58:20.2697264Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2697498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2697732Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2697965Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2698191Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2698422Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2698648Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2698876Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2699090Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2699321Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2699543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2699821Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2700054Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2700280Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2700505Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2700733Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2700943Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2701168Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2701449Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2701676Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2701901Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2702121Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2702343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2702567Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2702790Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2702999Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2703227Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2703450Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2703678Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2703901Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2704124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2704347Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2704573Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2704783Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2705010Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2705235Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2705458Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2705683Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2705902Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2706119Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2706343Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2706554Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2706774Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2706994Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2707219Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2707439Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2707707Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2707934Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2708040Z dist init r=0, world=2 2022-11-23T02:58:20.2708378Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2708687Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2709275Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2709684Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2710012Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2710326Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2710634Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2710937Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2711243Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2711549Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.2711852Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2712151Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2712245Z dist init r=1, world=2 2022-11-23T02:58:20.2712553Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2712864Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2713171Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2713473Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2713773Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2714075Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2714378Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2714732Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.2715039Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2715344Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2715649Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2715942Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2716099Z ok (10.221s) 2022-11-23T02:58:20.2716440Z test_transformer_offload_true_none_norm_type_None (__main__.TestParityWithDDP) ... INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92589 2022-11-23T02:58:20.2716659Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92590 2022-11-23T02:58:20.2717055Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2717226Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2717612Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2717799Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2718173Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2718336Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2718718Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 
2022-11-23T02:58:20.2718910Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2719152Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.2719394Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.2719798Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2720192Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2720421Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.2720651Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.2720873Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2721111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2722144Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2722255Z warnings.warn( 2022-11-23T02:58:20.2723321Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2723435Z warnings.warn( 2022-11-23T02:58:20.2723561Z File "", line 1, in 2022-11-23T02:58:20.2723775Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2723917Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2724116Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2724251Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2724466Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2724613Z self.run() 2022-11-23T02:58:20.2724811Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2724954Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2725301Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2725433Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2725802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2725909Z getattr(self, test_name)() 2022-11-23T02:58:20.2726272Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2726363Z fn() 2022-11-23T02:58:20.2726732Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2726851Z test(self, **param_kwargs) 2022-11-23T02:58:20.2727219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2727341Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2727579Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2727675Z self.run_subtests( 2022-11-23T02:58:20.2728028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2728189Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2728550Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2728697Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2729074Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2729194Z output = model(*input) 2022-11-23T02:58:20.2729519Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2729644Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2730030Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2730207Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2730574Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2730689Z _lazy_init(state, module) 2022-11-23T02:58:20.2731037Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2731172Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2731504Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2731617Z return func(*args, **kwargs) 2022-11-23T02:58:20.2731992Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.2732088Z p_assert( 2022-11-23T02:58:20.2732477Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2732604Z traceback.print_stack() 2022-11-23T02:58:20.2732729Z File "", line 1, in 2022-11-23T02:58:20.2732935Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2733062Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2733259Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2733405Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2733616Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2733765Z self.run() 2022-11-23T02:58:20.2733962Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2734101Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2734449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2734565Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2734931Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2735047Z getattr(self, test_name)() 2022-11-23T02:58:20.2735406Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2735499Z fn() 2022-11-23T02:58:20.2735867Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2735986Z test(self, **param_kwargs) 2022-11-23T02:58:20.2736347Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2736456Z return func(*args, **kwargs) 2022-11-23T02:58:20.2736696Z File 
"/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2736803Z self.run_subtests( 2022-11-23T02:58:20.2737162Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2737319Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2737682Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2737828Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2738206Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2738313Z output = model(*input) 2022-11-23T02:58:20.2738640Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2738776Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2739156Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2739330Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2739699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2739815Z _lazy_init(state, module) 2022-11-23T02:58:20.2740166Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2740295Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2740628Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2740754Z return func(*args, **kwargs) 2022-11-23T02:58:20.2741132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 
2022-11-23T02:58:20.2741229Z p_assert( 2022-11-23T02:58:20.2741612Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2741744Z traceback.print_stack() 2022-11-23T02:58:20.2741985Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2742205Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2742338Z File "", line 1, in 2022-11-23T02:58:20.2742538Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2742678Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2742879Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2743076Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2743289Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2743375Z self.run() 2022-11-23T02:58:20.2743582Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2743723Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2744072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2744197Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2744557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2744680Z getattr(self, test_name)() 2022-11-23T02:58:20.2745043Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2745128Z fn() 2022-11-23T02:58:20.2745496Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2745616Z test(self, **param_kwargs) 2022-11-23T02:58:20.2745979Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2746098Z return func(*args, **kwargs) 2022-11-23T02:58:20.2746339Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2746451Z self.run_subtests( 2022-11-23T02:58:20.2746811Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2746957Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2747323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2747474Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2747851Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2747968Z output = model(*input) 2022-11-23T02:58:20.2748297Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2748433Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2748813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2749229Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2749619Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2749740Z _lazy_init(state, module) 2022-11-23T02:58:20.2750094Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2750240Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2750617Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.2750815Z return func(*args, **kwargs) 2022-11-23T02:58:20.2751211Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2751297Z p_assert( 2022-11-23T02:58:20.2751636Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2751757Z traceback.print_stack() 2022-11-23T02:58:20.2751880Z File "", line 1, in 2022-11-23T02:58:20.2752091Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2752230Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2752432Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2752646Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2752847Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2752946Z self.run() 2022-11-23T02:58:20.2753152Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2753294Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2753639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2753769Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2754130Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2754237Z getattr(self, test_name)() 2022-11-23T02:58:20.2754597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2754694Z fn() 2022-11-23T02:58:20.2755060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2755179Z test(self, **param_kwargs) 2022-11-23T02:58:20.2755541Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2755663Z return func(*args, **kwargs) 2022-11-23T02:58:20.2755903Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2756000Z self.run_subtests( 2022-11-23T02:58:20.2756354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2756512Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2756880Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2757027Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2757399Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2757515Z output = model(*input) 2022-11-23T02:58:20.2757902Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2758028Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2758407Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2758579Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2758952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2759067Z _lazy_init(state, module) 2022-11-23T02:58:20.2759414Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2759560Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2759901Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.2760059Z return func(*args, **kwargs) 2022-11-23T02:58:20.2760447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2760546Z p_assert( 2022-11-23T02:58:20.2760881Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2761005Z traceback.print_stack() 2022-11-23T02:58:20.2761246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2761483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2761608Z File "", line 1, in 2022-11-23T02:58:20.2761856Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2761990Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2762194Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2762346Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2762556Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2762653Z self.run() 2022-11-23T02:58:20.2762852Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2762997Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2763330Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2763460Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2763820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2763944Z getattr(self, test_name)() 2022-11-23T02:58:20.2764305Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2764399Z fn() 
2022-11-23T02:58:20.2764769Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2764891Z test(self, **param_kwargs) 2022-11-23T02:58:20.2765242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2765369Z return func(*args, **kwargs) 2022-11-23T02:58:20.2765606Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2765717Z self.run_subtests( 2022-11-23T02:58:20.2766067Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2766230Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2766592Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2766741Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2767111Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2767230Z output = model(*input) 2022-11-23T02:58:20.2767554Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2767693Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2768066Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2768242Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2768616Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2768735Z _lazy_init(state, module) 2022-11-23T02:58:20.2769074Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:20.2769265Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2769614Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2769738Z return func(*args, **kwargs) 2022-11-23T02:58:20.2770120Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2770219Z p_assert( 2022-11-23T02:58:20.2770556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2770679Z traceback.print_stack() 2022-11-23T02:58:20.2770792Z File "", line 1, in 2022-11-23T02:58:20.2771046Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2771182Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2771388Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2771540Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2771749Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2771848Z self.run() 2022-11-23T02:58:20.2772036Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2772181Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2772519Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2772650Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2773017Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2773139Z getattr(self, test_name)() 2022-11-23T02:58:20.2773497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2773588Z fn() 2022-11-23T02:58:20.2773944Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2774059Z test(self, **param_kwargs) 2022-11-23T02:58:20.2774412Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2774530Z return func(*args, **kwargs) 2022-11-23T02:58:20.2774766Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2774874Z self.run_subtests( 2022-11-23T02:58:20.2775226Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2775393Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2775745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2775901Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2776276Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2776395Z output = model(*input) 2022-11-23T02:58:20.2776722Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2776859Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2777239Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2777415Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2777780Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2777897Z _lazy_init(state, module) 2022-11-23T02:58:20.2778247Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.2778436Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2778780Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2778902Z return func(*args, **kwargs) 2022-11-23T02:58:20.2779280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2779376Z p_assert( 2022-11-23T02:58:20.2779699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2779819Z traceback.print_stack() 2022-11-23T02:58:20.2780058Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2780357Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2780479Z File "", line 1, in 2022-11-23T02:58:20.2780690Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2780825Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2781022Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2781157Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2781366Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2781463Z self.run() 2022-11-23T02:58:20.2781667Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2781810Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2782156Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2782289Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2782642Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2782763Z getattr(self, test_name)() 
2022-11-23T02:58:20.2783125Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2783217Z fn() 2022-11-23T02:58:20.2783581Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2783700Z test(self, **param_kwargs) 2022-11-23T02:58:20.2784058Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2784177Z return func(*args, **kwargs) 2022-11-23T02:58:20.2784402Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2784513Z self.run_subtests( 2022-11-23T02:58:20.2784862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2785022Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2785386Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2785533Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2785905Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2786021Z output = model(*input) 2022-11-23T02:58:20.2786337Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2786474Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2786852Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2787022Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2787435Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2787555Z _lazy_init(state, module) 2022-11-23T02:58:20.2787907Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2788046Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2788372Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2788492Z return func(*args, **kwargs) 2022-11-23T02:58:20.2788867Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2789203Z p_assert( 2022-11-23T02:58:20.2789558Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2789683Z traceback.print_stack() 2022-11-23T02:58:20.2789808Z File "", line 1, in 2022-11-23T02:58:20.2790020Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2790148Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2790346Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2790494Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2790707Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2790805Z self.run() 2022-11-23T02:58:20.2791005Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2791145Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2791474Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2791606Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2791968Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2792092Z getattr(self, test_name)() 
2022-11-23T02:58:20.2792452Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2792544Z fn() 2022-11-23T02:58:20.2792910Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2793032Z test(self, **param_kwargs) 2022-11-23T02:58:20.2793376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2793501Z return func(*args, **kwargs) 2022-11-23T02:58:20.2793736Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2793845Z self.run_subtests( 2022-11-23T02:58:20.2794198Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2794359Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2794726Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2794874Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2795237Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2795353Z output = model(*input) 2022-11-23T02:58:20.2795678Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2795812Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2796192Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2796368Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2796811Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2796938Z _lazy_init(state, module) 2022-11-23T02:58:20.2797280Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2797419Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2797759Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2797878Z return func(*args, **kwargs) 2022-11-23T02:58:20.2798263Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2798422Z p_assert( 2022-11-23T02:58:20.2798767Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2798893Z traceback.print_stack() 2022-11-23T02:58:20.2799124Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2799362Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2799489Z File "", line 1, in 2022-11-23T02:58:20.2799695Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2799830Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2800031Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2800180Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2800389Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2800480Z self.run() 2022-11-23T02:58:20.2800683Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2800829Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2801178Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2801307Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2801675Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2801793Z getattr(self, test_name)() 2022-11-23T02:58:20.2802152Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2802232Z fn() 2022-11-23T02:58:20.2802599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2802719Z test(self, **param_kwargs) 2022-11-23T02:58:20.2803084Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2803207Z return func(*args, **kwargs) 2022-11-23T02:58:20.2803440Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2803550Z self.run_subtests( 2022-11-23T02:58:20.2803894Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2804055Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2804417Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2804565Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2804941Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2805058Z output = model(*input) 2022-11-23T02:58:20.2805387Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2805526Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2805997Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2806166Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2806544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2806661Z _lazy_init(state, module) 2022-11-23T02:58:20.2807016Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2807158Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2807496Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2807667Z return func(*args, **kwargs) 2022-11-23T02:58:20.2808056Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2808140Z p_assert( 2022-11-23T02:58:20.2808481Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2808606Z traceback.print_stack() 2022-11-23T02:58:20.2808732Z File "", line 1, in 2022-11-23T02:58:20.2808941Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2809076Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2809279Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2809413Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2809627Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2809732Z self.run() 2022-11-23T02:58:20.2809936Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2810081Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2810428Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2810558Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2810924Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2811031Z getattr(self, test_name)() 2022-11-23T02:58:20.2811396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2811491Z fn() 2022-11-23T02:58:20.2811859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2811974Z test(self, **param_kwargs) 2022-11-23T02:58:20.2812339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2812461Z return func(*args, **kwargs) 2022-11-23T02:58:20.2812704Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2812801Z self.run_subtests( 2022-11-23T02:58:20.2813154Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2813309Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2813674Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2813827Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2814200Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2814321Z output = model(*input) 2022-11-23T02:58:20.2814644Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2814770Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2815197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2815373Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2815743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2815856Z _lazy_init(state, module) 2022-11-23T02:58:20.2816206Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2816345Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2816685Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2816842Z return func(*args, **kwargs) 2022-11-23T02:58:20.2817229Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2817328Z p_assert( 2022-11-23T02:58:20.2817670Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2817794Z traceback.print_stack() 2022-11-23T02:58:20.2818027Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2818258Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2818385Z File "", line 1, in 2022-11-23T02:58:20.2818581Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2818716Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2818916Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2819067Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2819276Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2819374Z self.run() 2022-11-23T02:58:20.2819573Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2819706Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2820047Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2820172Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2820532Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2820651Z getattr(self, test_name)() 2022-11-23T02:58:20.2821009Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2821104Z fn() 2022-11-23T02:58:20.2821471Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2821577Z test(self, **param_kwargs) 2022-11-23T02:58:20.2821939Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2822059Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.2822296Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2822403Z self.run_subtests( 2022-11-23T02:58:20.2822754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2822912Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2823279Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2823420Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2823801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2823918Z output = model(*input) 2022-11-23T02:58:20.2824289Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2824436Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2824815Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2824986Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2825357Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2825462Z _lazy_init(state, module) 2022-11-23T02:58:20.2825813Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2826000Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2826340Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2826462Z return func(*args, **kwargs) 2022-11-23T02:58:20.2826844Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2826943Z p_assert( 2022-11-23T02:58:20.2827278Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2827388Z traceback.print_stack() 2022-11-23T02:58:20.2827510Z File "", line 1, in 2022-11-23T02:58:20.2827713Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2827849Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2828046Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2828193Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2828398Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2828484Z self.run() 2022-11-23T02:58:20.2828686Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2828829Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2829354Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2829482Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2829844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2829959Z getattr(self, test_name)() 2022-11-23T02:58:20.2830315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2830402Z fn() 2022-11-23T02:58:20.2830763Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2830877Z test(self, **param_kwargs) 2022-11-23T02:58:20.2831234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2831352Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2831589Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2831696Z self.run_subtests( 2022-11-23T02:58:20.2832049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2832194Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2832558Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2832711Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2833086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2833201Z output = model(*input) 2022-11-23T02:58:20.2833601Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2833750Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2834132Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2834292Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2834663Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2834775Z _lazy_init(state, module) 2022-11-23T02:58:20.2835125Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2835333Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2835678Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2835799Z return func(*args, **kwargs) 2022-11-23T02:58:20.2836185Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.2836272Z p_assert( 2022-11-23T02:58:20.2836606Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2836728Z traceback.print_stack() 2022-11-23T02:58:20.2836966Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2837203Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2837332Z File "", line 1, in 2022-11-23T02:58:20.2837541Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2837679Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2837867Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2838013Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2838221Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2838317Z self.run() 2022-11-23T02:58:20.2838515Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2838655Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2838996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2839121Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2839470Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2839589Z getattr(self, test_name)() 2022-11-23T02:58:20.2839949Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2840042Z fn() 2022-11-23T02:58:20.2840418Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2840537Z test(self, **param_kwargs) 
2022-11-23T02:58:20.2840893Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2841001Z return func(*args, **kwargs) 2022-11-23T02:58:20.2841236Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2841344Z self.run_subtests( 2022-11-23T02:58:20.2841698Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2841854Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2842219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2842366Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2842790Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2842898Z output = model(*input) 2022-11-23T02:58:20.2843233Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2843369Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2843744Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2843910Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2844276Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2844464Z _lazy_init(state, module) 2022-11-23T02:58:20.2844823Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2844970Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2845295Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:58:20.2845421Z return func(*args, **kwargs) 2022-11-23T02:58:20.2845803Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2845906Z p_assert( 2022-11-23T02:58:20.2846243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2846364Z traceback.print_stack() 2022-11-23T02:58:20.2846485Z File "", line 1, in 2022-11-23T02:58:20.2846682Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2846821Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2847026Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2847175Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2847385Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2847486Z self.run() 2022-11-23T02:58:20.2847684Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2847823Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2848155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2848280Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2848641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2848763Z getattr(self, test_name)() 2022-11-23T02:58:20.2849120Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2849211Z fn() 2022-11-23T02:58:20.2849580Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2849701Z test(self, **param_kwargs) 2022-11-23T02:58:20.2850047Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2850171Z return func(*args, **kwargs) 2022-11-23T02:58:20.2850406Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2850556Z self.run_subtests( 2022-11-23T02:58:20.2850916Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2851083Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2851450Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2851598Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2852016Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2852139Z output = model(*input) 2022-11-23T02:58:20.2852468Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2852604Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2852984Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2853161Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2853525Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2853691Z _lazy_init(state, module) 2022-11-23T02:58:20.2854035Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2854180Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2854520Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.2854639Z return func(*args, **kwargs) 2022-11-23T02:58:20.2855018Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2855113Z p_assert( 2022-11-23T02:58:20.2855447Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2855569Z traceback.print_stack() 2022-11-23T02:58:20.2855793Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2856028Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2856152Z File "", line 1, in 2022-11-23T02:58:20.2856359Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2856495Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2856693Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2856835Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2857032Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2857125Z self.run() 2022-11-23T02:58:20.2857325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2857470Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2857817Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2857948Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2858312Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2858433Z getattr(self, test_name)() 2022-11-23T02:58:20.2858782Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2858878Z fn() 
2022-11-23T02:58:20.2859245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2859364Z test(self, **param_kwargs) 2022-11-23T02:58:20.2859724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2859847Z return func(*args, **kwargs) 2022-11-23T02:58:20.2860086Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2860203Z self.run_subtests( 2022-11-23T02:58:20.2860546Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2860763Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2861144Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2861293Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2861664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2861777Z output = model(*input) 2022-11-23T02:58:20.2862100Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2862236Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2862601Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2862825Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2863197Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2863314Z _lazy_init(state, module) 2022-11-23T02:58:20.2863671Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:20.2863805Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2864144Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2864264Z return func(*args, **kwargs) 2022-11-23T02:58:20.2864632Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2864731Z p_assert( 2022-11-23T02:58:20.2865071Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2865195Z traceback.print_stack() 2022-11-23T02:58:20.2865319Z File "", line 1, in 2022-11-23T02:58:20.2865530Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2865670Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2865871Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2866005Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2866216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2866315Z self.run() 2022-11-23T02:58:20.2866518Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2866664Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2867008Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2867141Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2867492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2867616Z getattr(self, test_name)() 2022-11-23T02:58:20.2867982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2868076Z fn() 2022-11-23T02:58:20.2868443Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2868563Z test(self, **param_kwargs) 2022-11-23T02:58:20.2868920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2869224Z return func(*args, **kwargs) 2022-11-23T02:58:20.2869454Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2869573Z self.run_subtests( 2022-11-23T02:58:20.2869930Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2870160Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2870540Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2870693Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2871070Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2871188Z output = model(*input) 2022-11-23T02:58:20.2871502Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2871642Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2872020Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2872262Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2872642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2872758Z _lazy_init(state, module) 2022-11-23T02:58:20.2873105Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.2873244Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2873568Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2873690Z return func(*args, **kwargs) 2022-11-23T02:58:20.2874072Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2874166Z p_assert( 2022-11-23T02:58:20.2874508Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2874628Z traceback.print_stack() 2022-11-23T02:58:20.2874865Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2875105Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2875219Z File "", line 1, in 2022-11-23T02:58:20.2875430Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2875569Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2875773Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2875921Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2876134Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2876232Z self.run() 2022-11-23T02:58:20.2876441Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2876571Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2876917Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2877052Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2877422Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2877542Z getattr(self, test_name)() 
2022-11-23T02:58:20.2877901Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2877993Z fn() 2022-11-23T02:58:20.2878346Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2878462Z test(self, **param_kwargs) 2022-11-23T02:58:20.2878820Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2878946Z return func(*args, **kwargs) 2022-11-23T02:58:20.2879181Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2879339Z self.run_subtests( 2022-11-23T02:58:20.2879700Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2879856Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2880209Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2880362Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2880739Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2880856Z output = model(*input) 2022-11-23T02:58:20.2881233Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2881369Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2881751Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2881927Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2882300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2882404Z _lazy_init(state, module) 2022-11-23T02:58:20.2882758Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2882893Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2883227Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2883348Z return func(*args, **kwargs) 2022-11-23T02:58:20.2883728Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2883823Z p_assert( 2022-11-23T02:58:20.2884152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2884275Z traceback.print_stack() 2022-11-23T02:58:20.2884396Z File "", line 1, in 2022-11-23T02:58:20.2884605Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2884745Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2884948Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2885096Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2885302Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2885392Z self.run() 2022-11-23T02:58:20.2885591Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2885733Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2886079Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2886207Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2886569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2886689Z getattr(self, test_name)() 
2022-11-23T02:58:20.2887049Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2887130Z fn() 2022-11-23T02:58:20.2887491Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2887608Z test(self, **param_kwargs) 2022-11-23T02:58:20.2887971Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2888091Z return func(*args, **kwargs) 2022-11-23T02:58:20.2888327Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2888485Z self.run_subtests( 2022-11-23T02:58:20.2888835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2888997Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2889360Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2889504Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2889882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2890001Z output = model(*input) 2022-11-23T02:58:20.2890382Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2890521Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2890901Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2891063Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2891434Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.2891551Z _lazy_init(state, module) 2022-11-23T02:58:20.2891899Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2892035Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2892369Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2892498Z return func(*args, **kwargs) 2022-11-23T02:58:20.2892883Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2892968Z p_assert( 2022-11-23T02:58:20.2893303Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2893424Z traceback.print_stack() 2022-11-23T02:58:20.2893661Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2893894Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2894015Z File "", line 1, in 2022-11-23T02:58:20.2894218Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2894345Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2894547Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2894695Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2894905Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2895006Z self.run() 2022-11-23T02:58:20.2895208Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2895354Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2895699Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2895816Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2896184Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2896304Z getattr(self, test_name)() 2022-11-23T02:58:20.2896662Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2896757Z fn() 2022-11-23T02:58:20.2897123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2897241Z test(self, **param_kwargs) 2022-11-23T02:58:20.2897647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2897762Z return func(*args, **kwargs) 2022-11-23T02:58:20.2897999Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2898110Z self.run_subtests( 2022-11-23T02:58:20.2898466Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2898626Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2898992Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2899192Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2899569Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2899673Z output = model(*input) 2022-11-23T02:58:20.2900002Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2900138Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2900517Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2900684Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2901050Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2901168Z _lazy_init(state, module) 2022-11-23T02:58:20.2901519Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2901650Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2901991Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2902112Z return func(*args, **kwargs) 2022-11-23T02:58:20.2902496Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2902589Z p_assert( 2022-11-23T02:58:20.2902928Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2903048Z traceback.print_stack() 2022-11-23T02:58:20.2903175Z File "", line 1, in 2022-11-23T02:58:20.2903371Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2903512Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2903714Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2903865Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2904075Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2904171Z self.run() 2022-11-23T02:58:20.2904377Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2904507Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2904847Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2904974Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2905339Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2905457Z getattr(self, test_name)() 2022-11-23T02:58:20.2905819Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2905919Z fn() 2022-11-23T02:58:20.2906291Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2906396Z test(self, **param_kwargs) 2022-11-23T02:58:20.2906801Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2906927Z return func(*args, **kwargs) 2022-11-23T02:58:20.2907165Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2907276Z self.run_subtests( 2022-11-23T02:58:20.2907630Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2907789Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2908154Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2908353Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2908731Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2908844Z output = model(*input) 2022-11-23T02:58:20.2909427Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2909571Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2909958Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2910134Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2910504Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2910609Z _lazy_init(state, module) 2022-11-23T02:58:20.2910961Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2911106Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2911441Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2911563Z return func(*args, **kwargs) 2022-11-23T02:58:20.2911945Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2912042Z p_assert( 2022-11-23T02:58:20.2912379Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2912489Z traceback.print_stack() 2022-11-23T02:58:20.2912724Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2912956Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2913081Z File "", line 1, in 2022-11-23T02:58:20.2913293Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2913433Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2913631Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2913777Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2913976Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2914072Z self.run() 2022-11-23T02:58:20.2914268Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2914412Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2914758Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2914882Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2915242Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2915353Z getattr(self, test_name)() 2022-11-23T02:58:20.2915710Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2915799Z fn() 2022-11-23T02:58:20.2916235Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2916363Z test(self, **param_kwargs) 2022-11-23T02:58:20.2916722Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2916841Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.2917078Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2917174Z self.run_subtests( 2022-11-23T02:58:20.2917528Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2917753Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2918116Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2918265Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2918647Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2918766Z output = model(*input) 2022-11-23T02:58:20.2919092Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2919217Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2919593Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2919762Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2920127Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2920247Z _lazy_init(state, module) 2022-11-23T02:58:20.2920599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2920738Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2921073Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2921181Z return func(*args, **kwargs) 2022-11-23T02:58:20.2921554Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2921649Z p_assert( 2022-11-23T02:58:20.2921982Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2922102Z traceback.print_stack() 2022-11-23T02:58:20.2922223Z File "", line 1, in 2022-11-23T02:58:20.2922434Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2922570Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2922760Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2922909Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2923120Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2923217Z self.run() 2022-11-23T02:58:20.2923414Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2923558Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2923900Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2924025Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2924374Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2924499Z getattr(self, test_name)() 2022-11-23T02:58:20.2924859Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2924953Z fn() 2022-11-23T02:58:20.2925366Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2925487Z test(self, **param_kwargs) 2022-11-23T02:58:20.2925848Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2925956Z return func(*args, **kwargs) 
2022-11-23T02:58:20.2926196Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2926304Z self.run_subtests( 2022-11-23T02:58:20.2926657Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2926865Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2927228Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2927378Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2927754Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2927865Z output = model(*input) 2022-11-23T02:58:20.2928179Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2928318Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2928692Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2928865Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2929233Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2929356Z _lazy_init(state, module) 2022-11-23T02:58:20.2929709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2929853Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2930177Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2930302Z return func(*args, **kwargs) 2022-11-23T02:58:20.2930679Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.2930781Z p_assert( 2022-11-23T02:58:20.2931121Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2931239Z traceback.print_stack() 2022-11-23T02:58:20.2931473Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2931698Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2931825Z File "", line 1, in 2022-11-23T02:58:20.2932035Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2932177Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2932377Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2932531Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2932739Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2932834Z self.run() 2022-11-23T02:58:20.2933023Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2933166Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2933513Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2933646Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2934007Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2934129Z getattr(self, test_name)() 2022-11-23T02:58:20.2934535Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2934636Z fn() 2022-11-23T02:58:20.2934996Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2935116Z test(self, **param_kwargs) 
2022-11-23T02:58:20.2935472Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2935593Z return func(*args, **kwargs) 2022-11-23T02:58:20.2935829Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2935985Z self.run_subtests( 2022-11-23T02:58:20.2936341Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2936498Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2936852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2936998Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2937378Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2937496Z output = model(*input) 2022-11-23T02:58:20.2937822Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2937958Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2938336Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2938516Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2938878Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2938999Z _lazy_init(state, module) 2022-11-23T02:58:20.2939349Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2939491Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2939830Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:58:20.2939955Z return func(*args, **kwargs) 2022-11-23T02:58:20.2940337Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2940438Z p_assert( 2022-11-23T02:58:20.2940769Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2940894Z traceback.print_stack() 2022-11-23T02:58:20.2941021Z File "", line 1, in 2022-11-23T02:58:20.2941231Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2941367Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2941572Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2941720Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2941919Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2942018Z self.run() 2022-11-23T02:58:20.2942219Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2942364Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2942708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2942840Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2943202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2943365Z getattr(self, test_name)() 2022-11-23T02:58:20.2943723Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2943815Z fn() 2022-11-23T02:58:20.2944174Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2944291Z test(self, **param_kwargs) 2022-11-23T02:58:20.2944646Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2944770Z return func(*args, **kwargs) 2022-11-23T02:58:20.2945004Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2945215Z self.run_subtests( 2022-11-23T02:58:20.2945561Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2945724Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2946090Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2946237Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2946613Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2946727Z output = model(*input) 2022-11-23T02:58:20.2947052Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2947191Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2947556Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2947734Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2948106Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2948221Z _lazy_init(state, module) 2022-11-23T02:58:20.2948572Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2948712Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2949237Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.2949367Z return func(*args, **kwargs) 2022-11-23T02:58:20.2949743Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2949843Z p_assert( 2022-11-23T02:58:20.2950186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.2950306Z traceback.print_stack() 2022-11-23T02:58:20.2950545Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2950816Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2951049Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2951282Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2951498Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2951725Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2952496Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2953325Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2954091Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. 
This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2954844Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2955658Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2956397Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2956634Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2956873Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2957111Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2957345Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2957572Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2957800Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2958017Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2958246Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2958471Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2958697Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2958933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2959157Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2959912Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2960664Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2960889Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2961126Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2961341Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2961620Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2961852Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2962080Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2962306Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2962534Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2962758Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2962984Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2963254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2963468Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2964227Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2964981Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2965206Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2965434Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2965664Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2965896Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2966125Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2966351Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2966577Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2966789Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2967018Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2967243Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2967472Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2967694Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2968444Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2969186Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.2969415Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2969646Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2969872Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.2970133Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2970361Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2970587Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2970697Z dist init r=0, world=2 2022-11-23T02:58:20.2971033Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2971357Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2971724Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2972038Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2972352Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2972654Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2972945Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2973251Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.2973555Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2973857Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2974155Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2974455Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.2974559Z dist init r=1, world=2 2022-11-23T02:58:20.2974892Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2975214Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2975530Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2975836Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2976131Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2976430Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.2976742Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2977088Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2977400Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2977704Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2978013Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2978366Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.2978461Z ok (10.621s) 2022-11-23T02:58:20.2978812Z test_transformer_offload_true_shard_grad_op_norm_type_None (__main__.TestParityWithDDP) ... 
INFO:torch.testing._internal.common_distributed:Started process 0 with pid 92672 2022-11-23T02:58:20.2979031Z INFO:torch.testing._internal.common_distributed:Started process 1 with pid 92673 2022-11-23T02:58:20.2979402Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2979577Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2979959Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2980158Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2980533Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:121: UserWarning: loaded 51 slow tests 2022-11-23T02:58:20.2980705Z warnings.warn(f"loaded {len(slow_tests_dict)} slow tests") 2022-11-23T02:58:20.2981087Z /opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:125: UserWarning: loaded 420 disabled tests 2022-11-23T02:58:20.2981276Z warnings.warn(f"loaded {len(disabled_tests_dict)} disabled tests") 2022-11-23T02:58:20.2981525Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 1 2022-11-23T02:58:20.2981757Z INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0 2022-11-23T02:58:20.2982163Z INFO:torch.distributed.distributed_c10d:Rank 1: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 2022-11-23T02:58:20.2982557Z INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 2 nodes. 
2022-11-23T02:58:20.2982789Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 0 2022-11-23T02:58:20.2983020Z INFO:torch.testing._internal.common_distributed:Starting event listener thread for rank 1 2022-11-23T02:58:20.2983259Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2983495Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.2984529Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2022-11-23T02:58:20.2984644Z warnings.warn( 2022-11-23T02:58:20.2985724Z /opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:608: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2022-11-23T02:58:20.2985837Z warnings.warn( 2022-11-23T02:58:20.2985954Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2986167Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2986307Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2986506Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2986657Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2986920Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2987020Z self.run() 2022-11-23T02:58:20.2987209Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2987357Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2987706Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2987836Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2988203Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2988322Z getattr(self, test_name)() 2022-11-23T02:58:20.2988683Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2988774Z fn() 2022-11-23T02:58:20.2989372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2989507Z test(self, **param_kwargs) 2022-11-23T02:58:20.2989870Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2989999Z return func(*args, **kwargs) 2022-11-23T02:58:20.2990241Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2990353Z self.run_subtests( 2022-11-23T02:58:20.2990707Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.2990866Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.2991219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.2991367Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.2991745Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.2991860Z output = model(*input) 2022-11-23T02:58:20.2992188Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.2992332Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.2992709Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.2992878Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.2993236Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.2993355Z _lazy_init(state, module) 2022-11-23T02:58:20.2993710Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.2993853Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.2994193Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.2994313Z return func(*args, **kwargs) 2022-11-23T02:58:20.2994772Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.2994874Z p_assert( 2022-11-23T02:58:20.2995206Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.2995335Z traceback.print_stack() 2022-11-23T02:58:20.2995464Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.2995673Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.2995813Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.2996018Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.2996232Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.2996450Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.2996537Z self.run() 2022-11-23T02:58:20.2996738Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.2996885Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.2997233Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.2997363Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.2997727Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.2997845Z getattr(self, test_name)() 2022-11-23T02:58:20.2998195Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.2998285Z fn() 2022-11-23T02:58:20.2998650Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.2998775Z test(self, **param_kwargs) 2022-11-23T02:58:20.2999134Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.2999256Z return func(*args, **kwargs) 2022-11-23T02:58:20.2999496Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.2999603Z self.run_subtests( 2022-11-23T02:58:20.2999947Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3000103Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3000469Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3000616Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3001000Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3001116Z output = model(*input) 2022-11-23T02:58:20.3001446Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3001584Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3001952Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3002124Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3002499Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3002621Z _lazy_init(state, module) 2022-11-23T02:58:20.3002975Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3003119Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3003459Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3003580Z return func(*args, **kwargs) 2022-11-23T02:58:20.3003997Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3004102Z p_assert( 2022-11-23T02:58:20.3004442Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.3004563Z traceback.print_stack() 2022-11-23T02:58:20.3004803Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3005041Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3005163Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.3005367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3005543Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3005746Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3005898Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3006116Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3006216Z self.run() 2022-11-23T02:58:20.3006413Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3006552Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3006896Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3007014Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3007376Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3007497Z getattr(self, test_name)() 2022-11-23T02:58:20.3007862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3007957Z fn() 2022-11-23T02:58:20.3008327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3008448Z test(self, **param_kwargs) 2022-11-23T02:58:20.3008806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3008914Z return func(*args,
**kwargs) 2022-11-23T02:58:20.3009146Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3009256Z self.run_subtests( 2022-11-23T02:58:20.3009606Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3009767Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3010140Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3010287Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3010664Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3010768Z output = model(*input) 2022-11-23T02:58:20.3011094Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3011232Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3011609Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3011781Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3012153Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3012278Z _lazy_init(state, module) 2022-11-23T02:58:20.3012634Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3012760Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3013147Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3013273Z return func(*args, **kwargs) 2022-11-23T02:58:20.3013657Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3013751Z p_assert( 2022-11-23T02:58:20.3014086Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3014210Z traceback.print_stack() 2022-11-23T02:58:20.3014323Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.3014531Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3014719Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3014918Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3015066Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3015283Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3015385Z self.run() 2022-11-23T02:58:20.3015588Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3015718Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3016064Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3016194Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3016557Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3016677Z getattr(self, test_name)() 2022-11-23T02:58:20.3017038Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3017126Z fn() 2022-11-23T02:58:20.3017492Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3017598Z test(self, **param_kwargs) 2022-11-23T02:58:20.3017957Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3018075Z return func(*args, **kwargs)
2022-11-23T02:58:20.3018315Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3018426Z self.run_subtests( 2022-11-23T02:58:20.3018779Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3018935Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3019302Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3019439Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3019816Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3019931Z output = model(*input) 2022-11-23T02:58:20.3020260Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3020398Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3020775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3020950Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3021316Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3021423Z _lazy_init(state, module) 2022-11-23T02:58:20.3021777Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3021918Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3022299Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3022425Z return func(*args, **kwargs) 2022-11-23T02:58:20.3022809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.3022907Z p_assert( 2022-11-23T02:58:20.3023243Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3023353Z traceback.print_stack() 2022-11-23T02:58:20.3023592Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3023879Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3024008Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.3024221Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3024364Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3024571Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3024723Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3024921Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3025018Z self.run() 2022-11-23T02:58:20.3025220Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3025362Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3025708Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3025838Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3026202Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3026310Z getattr(self, test_name)() 2022-11-23T02:58:20.3026676Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3026769Z fn() 2022-11-23T02:58:20.3027129Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3027246Z test(self, **param_kwargs)
2022-11-23T02:58:20.3027600Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3027720Z return func(*args, **kwargs) 2022-11-23T02:58:20.3027954Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3028055Z self.run_subtests( 2022-11-23T02:58:20.3028410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3028568Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3029103Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3029272Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3029658Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3029774Z output = model(*input) 2022-11-23T02:58:20.3030098Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3030223Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3030599Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3030784Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3031152Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3031335Z _lazy_init(state, module) 2022-11-23T02:58:20.3031699Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3031843Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3032184Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:58:20.3032295Z return func(*args, **kwargs) 2022-11-23T02:58:20.3032675Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3032773Z p_assert( 2022-11-23T02:58:20.3033112Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3033325Z traceback.print_stack() 2022-11-23T02:58:20.3033452Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.3033660Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3033801Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3033990Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3034137Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3034347Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3034441Z self.run() 2022-11-23T02:58:20.3034637Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3034775Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3035123Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3035243Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3035607Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3035725Z getattr(self, test_name)() 2022-11-23T02:58:20.3036086Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3036178Z fn() 2022-11-23T02:58:20.3036541Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3036660Z test(self, **param_kwargs) 2022-11-23T02:58:20.3037019Z File
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3037128Z return func(*args, **kwargs) 2022-11-23T02:58:20.3037366Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3037479Z self.run_subtests( 2022-11-23T02:58:20.3037835Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3037992Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3038362Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3038510Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3038886Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3038990Z output = model(*input) 2022-11-23T02:58:20.3039316Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3039453Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3039826Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3040009Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3040380Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3040546Z _lazy_init(state, module) 2022-11-23T02:58:20.3040905Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3041033Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3041366Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.3041482Z return func(*args, **kwargs) 2022-11-23T02:58:20.3041861Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3041960Z p_assert( 2022-11-23T02:58:20.3042293Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3042467Z traceback.print_stack() 2022-11-23T02:58:20.3042705Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3042929Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3043054Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.3043263Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3043402Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3043603Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3043752Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3043963Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3044063Z self.run() 2022-11-23T02:58:20.3044250Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3044395Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3044738Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3044864Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3045234Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3045354Z getattr(self, test_name)() 2022-11-23T02:58:20.3045712Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3045805Z fn()
2022-11-23T02:58:20.3046157Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3046274Z test(self, **param_kwargs) 2022-11-23T02:58:20.3046625Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3046749Z return func(*args, **kwargs) 2022-11-23T02:58:20.3046981Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3047087Z self.run_subtests( 2022-11-23T02:58:20.3047442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3047588Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3047951Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3048099Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3048476Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3048593Z output = model(*input) 2022-11-23T02:58:20.3048917Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3049055Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3049437Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3049662Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3050026Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3050148Z _lazy_init(state, module) 2022-11-23T02:58:20.3050500Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:20.3050684Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3051029Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3051152Z return func(*args, **kwargs) 2022-11-23T02:58:20.3051594Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3051689Z p_assert( 2022-11-23T02:58:20.3052013Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3052142Z traceback.print_stack() 2022-11-23T02:58:20.3052269Z File "", line 1, in 2022-11-23T02:58:20.3052484Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3052621Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3052821Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3052966Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3053167Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3053265Z self.run() 2022-11-23T02:58:20.3053463Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3053606Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3053947Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3054077Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3054449Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3054570Z getattr(self, test_name)() 2022-11-23T02:58:20.3054920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3055017Z fn() 2022-11-23T02:58:20.3055382Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3055498Z test(self, **param_kwargs) 2022-11-23T02:58:20.3055852Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3055975Z return func(*args, **kwargs) 2022-11-23T02:58:20.3056214Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3056322Z self.run_subtests( 2022-11-23T02:58:20.3056666Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3056825Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3057193Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3057345Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3057766Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3057884Z output = model(*input) 2022-11-23T02:58:20.3058211Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3058350Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3058715Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3058938Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3059314Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3059432Z _lazy_init(state, module) 2022-11-23T02:58:20.3059785Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.3059925Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3060264Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3060384Z return func(*args, **kwargs) 2022-11-23T02:58:20.3060810Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3060912Z p_assert( 2022-11-23T02:58:20.3061248Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3061374Z traceback.print_stack() 2022-11-23T02:58:20.3061609Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3061847Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3061973Z File "", line 1, in 2022-11-23T02:58:20.3062182Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3062309Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3062502Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3062645Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3062861Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3062956Z self.run() 2022-11-23T02:58:20.3063154Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3063295Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3063629Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3063757Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3064119Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3064240Z getattr(self, test_name)() 
2022-11-23T02:58:20.3064599Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3064693Z fn() 2022-11-23T02:58:20.3065060Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3065184Z test(self, **param_kwargs) 2022-11-23T02:58:20.3065529Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3065656Z return func(*args, **kwargs) 2022-11-23T02:58:20.3065897Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3066004Z self.run_subtests( 2022-11-23T02:58:20.3066358Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3066518Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3066883Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3067024Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3067396Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3067518Z output = model(*input) 2022-11-23T02:58:20.3067847Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3068036Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3068429Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3068603Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3069193Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.3069324Z _lazy_init(state, module) 2022-11-23T02:58:20.3069673Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3069817Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3070243Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3070363Z return func(*args, **kwargs) 2022-11-23T02:58:20.3070750Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3070846Z p_assert( 2022-11-23T02:58:20.3071184Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3071308Z traceback.print_stack() 2022-11-23T02:58:20.3071420Z File "", line 1, in 2022-11-23T02:58:20.3071624Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3071759Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3071965Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3072112Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3072325Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3072423Z self.run() 2022-11-23T02:58:20.3072613Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3072764Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3073107Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3073231Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3073598Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3073717Z getattr(self, test_name)() 
2022-11-23T02:58:20.3074072Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3074160Z fn() 2022-11-23T02:58:20.3074512Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3074634Z test(self, **param_kwargs) 2022-11-23T02:58:20.3074994Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3075114Z return func(*args, **kwargs) 2022-11-23T02:58:20.3075352Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3075458Z self.run_subtests( 2022-11-23T02:58:20.3075807Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3075967Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3076319Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3076468Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3076844Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3076961Z output = model(*input) 2022-11-23T02:58:20.3077282Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3077485Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3077875Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3078046Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3078401Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.3078519Z _lazy_init(state, module) 2022-11-23T02:58:20.3078872Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3079058Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3079397Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3079515Z return func(*args, **kwargs) 2022-11-23T02:58:20.3079900Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3080002Z p_assert( 2022-11-23T02:58:20.3080327Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3080450Z traceback.print_stack() 2022-11-23T02:58:20.3080688Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3080920Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.3081048Z File "", line 1, in 2022-11-23T02:58:20.3081260Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3081405Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3081606Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3081741Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3081954Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3082054Z self.run() 2022-11-23T02:58:20.3082256Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3082399Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3082741Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3082871Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3083219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3083341Z getattr(self, test_name)() 2022-11-23T02:58:20.3083707Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3083802Z fn() 2022-11-23T02:58:20.3084169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3084286Z test(self, **param_kwargs) 2022-11-23T02:58:20.3084641Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3084761Z return func(*args, **kwargs) 2022-11-23T02:58:20.3084984Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3085092Z self.run_subtests( 2022-11-23T02:58:20.3085447Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3085608Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3085980Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3086130Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3086603Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3086728Z output = model(*input) 2022-11-23T02:58:20.3087044Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3087183Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3087566Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3087741Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3088115Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3088283Z _lazy_init(state, module) 2022-11-23T02:58:20.3088641Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3088781Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3089110Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3089239Z return func(*args, **kwargs) 2022-11-23T02:58:20.3089622Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3089718Z p_assert( 2022-11-23T02:58:20.3090057Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.3090187Z traceback.print_stack() 2022-11-23T02:58:20.3090310Z File "", line 1, in 2022-11-23T02:58:20.3090517Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3090647Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3090851Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3091002Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3091213Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3091317Z self.run() 2022-11-23T02:58:20.3091520Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3091665Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3091993Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3092124Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3092489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3092611Z getattr(self, test_name)() 2022-11-23T02:58:20.3092974Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3093069Z fn() 2022-11-23T02:58:20.3093441Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3093558Z test(self, **param_kwargs) 2022-11-23T02:58:20.3093903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3094027Z return func(*args, **kwargs) 2022-11-23T02:58:20.3094261Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3094373Z self.run_subtests( 2022-11-23T02:58:20.3094727Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3094883Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3095247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3095397Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3095806Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3095927Z output = model(*input) 2022-11-23T02:58:20.3096253Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3096395Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3096775Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3096946Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3097318Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3097494Z _lazy_init(state, module) 2022-11-23T02:58:20.3097837Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3097972Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3098311Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3098431Z return func(*args, **kwargs) 2022-11-23T02:58:20.3098816Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3098915Z p_assert( 2022-11-23T02:58:20.3099252Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.3099374Z traceback.print_stack() 2022-11-23T02:58:20.3099599Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3099841Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3099963Z File "", line 1, in 2022-11-23T02:58:20.3100169Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3100308Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3100507Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3100653Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3100862Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3100948Z self.run() 2022-11-23T02:58:20.3101151Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3101289Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3101635Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3101766Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3102133Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3102252Z getattr(self, test_name)() 2022-11-23T02:58:20.3102618Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3102700Z fn() 2022-11-23T02:58:20.3103066Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3103184Z test(self, **param_kwargs) 2022-11-23T02:58:20.3103539Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3103657Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.3103891Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3103999Z self.run_subtests( 2022-11-23T02:58:20.3104345Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3104508Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3104920Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3105076Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3105454Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3105569Z output = model(*input) 2022-11-23T02:58:20.3105898Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3106038Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3106409Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3106619Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3106988Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3107107Z _lazy_init(state, module) 2022-11-23T02:58:20.3107465Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3107608Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3107946Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3108071Z return func(*args, **kwargs) 2022-11-23T02:58:20.3108453Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3108538Z p_assert( 2022-11-23T02:58:20.3108880Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3109177Z traceback.print_stack() 2022-11-23T02:58:20.3109314Z File "", line 1, in 2022-11-23T02:58:20.3109524Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3109670Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3109871Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3110006Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3110216Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3110316Z self.run() 2022-11-23T02:58:20.3110515Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3110658Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3111005Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3111142Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3111511Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3111617Z getattr(self, test_name)() 2022-11-23T02:58:20.3111982Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3112078Z fn() 2022-11-23T02:58:20.3112446Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3112564Z test(self, **param_kwargs) 2022-11-23T02:58:20.3112922Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3113047Z return func(*args, **kwargs) 
2022-11-23T02:58:20.3113282Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3113382Z self.run_subtests( 2022-11-23T02:58:20.3113734Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3113893Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3114325Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3114483Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3114862Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3114977Z output = model(*input) 2022-11-23T02:58:20.3115305Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3115429Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3115809Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3116044Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3116416Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3116540Z _lazy_init(state, module) 2022-11-23T02:58:20.3116895Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3117034Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3117371Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3117479Z return func(*args, **kwargs) 2022-11-23T02:58:20.3117856Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.3117955Z p_assert( 2022-11-23T02:58:20.3118285Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3118408Z traceback.print_stack() 2022-11-23T02:58:20.3118642Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3118875Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3118999Z File "", line 1, in 2022-11-23T02:58:20.3119194Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3119330Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3119524Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3119671Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3119885Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3119987Z self.run() 2022-11-23T02:58:20.3120186Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3120321Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3120661Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3120787Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3121155Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3121277Z getattr(self, test_name)() 2022-11-23T02:58:20.3121636Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3121730Z fn() 2022-11-23T02:58:20.3122102Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3122209Z test(self, **param_kwargs) 
2022-11-23T02:58:20.3122568Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3122695Z return func(*args, **kwargs) 2022-11-23T02:58:20.3122930Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3123036Z self.run_subtests( 2022-11-23T02:58:20.3123442Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3123607Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3123975Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3124113Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3124489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3124602Z output = model(*input) 2022-11-23T02:58:20.3124927Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3125112Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3125490Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3125669Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3126036Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3126141Z _lazy_init(state, module) 2022-11-23T02:58:20.3126497Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3126634Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3126975Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in 
decorate_context 2022-11-23T02:58:20.3127098Z return func(*args, **kwargs) 2022-11-23T02:58:20.3127480Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3127582Z p_assert( 2022-11-23T02:58:20.3127922Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3128035Z traceback.print_stack() 2022-11-23T02:58:20.3128163Z File "", line 1, in 2022-11-23T02:58:20.3128367Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3128504Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3128706Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3128857Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3129066Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3129164Z self.run() 2022-11-23T02:58:20.3129353Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3129496Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3129834Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3129959Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3130323Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3130440Z getattr(self, test_name)() 2022-11-23T02:58:20.3130799Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3130880Z fn() 2022-11-23T02:58:20.3131247Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3131367Z test(self, **param_kwargs) 2022-11-23T02:58:20.3131725Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3131851Z return func(*args, **kwargs) 2022-11-23T02:58:20.3132092Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3132200Z self.run_subtests( 2022-11-23T02:58:20.3132597Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3132749Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3133121Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3133266Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3133639Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3133750Z output = model(*input) 2022-11-23T02:58:20.3134075Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3134264Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3134647Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3134811Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3135186Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3135302Z _lazy_init(state, module) 2022-11-23T02:58:20.3135656Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3135797Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3136135Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 
2022-11-23T02:58:20.3136256Z return func(*args, **kwargs) 2022-11-23T02:58:20.3136642Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3136726Z p_assert( 2022-11-23T02:58:20.3137067Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3137196Z traceback.print_stack() 2022-11-23T02:58:20.3137437Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3137673Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3137800Z File "", line 1, in 2022-11-23T02:58:20.3138013Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3138153Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3138340Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3138491Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3138703Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3138799Z self.run() 2022-11-23T02:58:20.3138998Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3139142Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3139485Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3139611Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3139961Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3140086Z getattr(self, test_name)() 2022-11-23T02:58:20.3140447Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3140540Z fn() 
2022-11-23T02:58:20.3140903Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3141026Z test(self, **param_kwargs) 2022-11-23T02:58:20.3141385Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3141541Z return func(*args, **kwargs) 2022-11-23T02:58:20.3141786Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3141895Z self.run_subtests( 2022-11-23T02:58:20.3142245Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3142403Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3142771Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3142920Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3143298Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3143464Z output = model(*input) 2022-11-23T02:58:20.3143782Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3143926Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3144304Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3144475Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3144839Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3144957Z _lazy_init(state, module) 2022-11-23T02:58:20.3145313Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", 
line 62, in _lazy_init 2022-11-23T02:58:20.3145454Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3145787Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3145908Z return func(*args, **kwargs) 2022-11-23T02:58:20.3146290Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3146390Z p_assert( 2022-11-23T02:58:20.3146727Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3146846Z traceback.print_stack() 2022-11-23T02:58:20.3146967Z File "", line 1, in 2022-11-23T02:58:20.3147163Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3147300Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3147500Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3147648Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3147865Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3147967Z self.run() 2022-11-23T02:58:20.3148167Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3148310Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3148643Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3148767Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3149309Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3149437Z getattr(self, test_name)() 2022-11-23T02:58:20.3149802Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3149895Z fn() 2022-11-23T02:58:20.3150262Z 
File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3150388Z test(self, **param_kwargs) 2022-11-23T02:58:20.3150765Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3150965Z return func(*args, **kwargs) 2022-11-23T02:58:20.3151215Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3151327Z self.run_subtests( 2022-11-23T02:58:20.3151692Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3151850Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3152214Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3152360Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3152724Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3152901Z output = model(*input) 2022-11-23T02:58:20.3153229Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3153372Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3153755Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3153927Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3154296Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3154414Z _lazy_init(state, module) 2022-11-23T02:58:20.3154754Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 
2022-11-23T02:58:20.3154888Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3155231Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3155348Z return func(*args, **kwargs) 2022-11-23T02:58:20.3155731Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3155831Z p_assert( 2022-11-23T02:58:20.3156173Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3156294Z traceback.print_stack() 2022-11-23T02:58:20.3156520Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3156753Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3156881Z File "", line 1, in 2022-11-23T02:58:20.3157090Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3157231Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3157423Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3157568Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3157769Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3157866Z self.run() 2022-11-23T02:58:20.3158061Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3158199Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3158538Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3158668Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3159028Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3159148Z getattr(self, test_name)() 
2022-11-23T02:58:20.3159499Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3159593Z fn() 2022-11-23T02:58:20.3159956Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3160119Z test(self, **param_kwargs) 2022-11-23T02:58:20.3160489Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3160608Z return func(*args, **kwargs) 2022-11-23T02:58:20.3160845Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3160952Z self.run_subtests( 2022-11-23T02:58:20.3161292Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3161448Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3161882Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3162032Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3162410Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3162525Z output = model(*input) 2022-11-23T02:58:20.3162851Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3162989Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3163354Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3163528Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3163896Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.3164018Z _lazy_init(state, module) 2022-11-23T02:58:20.3164372Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3164511Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3164854Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3164975Z return func(*args, **kwargs) 2022-11-23T02:58:20.3165344Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3165446Z p_assert( 2022-11-23T02:58:20.3165786Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3165906Z traceback.print_stack() 2022-11-23T02:58:20.3166033Z File "", line 1, in 2022-11-23T02:58:20.3166239Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3166381Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3166584Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3166719Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3166930Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3167029Z self.run() 2022-11-23T02:58:20.3167228Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3167369Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3167711Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3167839Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3168189Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3168310Z getattr(self, test_name)() 
2022-11-23T02:58:20.3168679Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3168774Z fn() 2022-11-23T02:58:20.3169137Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3169302Z test(self, **param_kwargs) 2022-11-23T02:58:20.3169672Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3169796Z return func(*args, **kwargs) 2022-11-23T02:58:20.3170022Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3170129Z self.run_subtests( 2022-11-23T02:58:20.3170481Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3170641Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3171056Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3171201Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3171576Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3171691Z output = model(*input) 2022-11-23T02:58:20.3172008Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3172144Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3172524Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3172697Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3173061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in 
_root_pre_forward 2022-11-23T02:58:20.3173178Z _lazy_init(state, module) 2022-11-23T02:58:20.3173529Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3173670Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3173999Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3174125Z return func(*args, **kwargs) 2022-11-23T02:58:20.3174503Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3174599Z p_assert( 2022-11-23T02:58:20.3174932Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3175051Z traceback.print_stack() 2022-11-23T02:58:20.3175284Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3175519Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.3175636Z File "", line 1, in 2022-11-23T02:58:20.3175843Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3175984Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3176186Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3176330Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3176538Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3176639Z self.run() 2022-11-23T02:58:20.3176836Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3176966Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3177304Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3177431Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3177797Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3177917Z getattr(self, test_name)() 2022-11-23T02:58:20.3178327Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3178427Z fn() 2022-11-23T02:58:20.3178783Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3178898Z test(self, **param_kwargs) 2022-11-23T02:58:20.3179255Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3179375Z return func(*args, **kwargs) 2022-11-23T02:58:20.3179614Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3179723Z self.run_subtests( 2022-11-23T02:58:20.3180129Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3180286Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3180644Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3180794Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3181172Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3181285Z output = model(*input) 2022-11-23T02:58:20.3181607Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3181743Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3182126Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3182306Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3182674Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3182779Z _lazy_init(state, module) 2022-11-23T02:58:20.3183135Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3183278Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3183619Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3183737Z return func(*args, **kwargs) 2022-11-23T02:58:20.3184119Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3184220Z p_assert( 2022-11-23T02:58:20.3184544Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.3184674Z traceback.print_stack() 2022-11-23T02:58:20.3184799Z File "", line 1, in 2022-11-23T02:58:20.3185008Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3185153Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3185352Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3185496Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3185706Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3185791Z self.run() 2022-11-23T02:58:20.3185991Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3186128Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3186462Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3186594Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3186960Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3187074Z getattr(self, test_name)() 2022-11-23T02:58:20.3187483Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3187569Z fn() 2022-11-23T02:58:20.3187936Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3188051Z test(self, **param_kwargs) 2022-11-23T02:58:20.3188402Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3188521Z return func(*args, **kwargs) 2022-11-23T02:58:20.3188760Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3188871Z self.run_subtests( 2022-11-23T02:58:20.3189529Z File 
"/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3189677Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3190048Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3190199Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3190572Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3190685Z output = model(*input) 2022-11-23T02:58:20.3191009Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3191146Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3191528Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3191694Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3192061Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3192174Z _lazy_init(state, module) 2022-11-23T02:58:20.3192530Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3192673Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3193012Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3193133Z return func(*args, **kwargs) 2022-11-23T02:58:20.3193509Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3193594Z p_assert( 2022-11-23T02:58:20.3193925Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 
2022-11-23T02:58:20.3194050Z traceback.print_stack() 2022-11-23T02:58:20.3194292Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3194526Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3194652Z File "", line 1, in 2022-11-23T02:58:20.3194860Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3194988Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3195183Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3195325Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3195533Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3195633Z self.run() 2022-11-23T02:58:20.3195835Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3195980Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3196324Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3196440Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3196873Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3197005Z getattr(self, test_name)() 2022-11-23T02:58:20.3197372Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3197465Z fn() 2022-11-23T02:58:20.3197828Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3197942Z test(self, **param_kwargs) 2022-11-23T02:58:20.3198295Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3198467Z return func(*args, 
**kwargs) 2022-11-23T02:58:20.3198705Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3198811Z self.run_subtests( 2022-11-23T02:58:20.3199169Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3199329Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3199693Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3199844Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3200219Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3200321Z output = model(*input) 2022-11-23T02:58:20.3200644Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3200785Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3201164Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3201339Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3201711Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3201828Z _lazy_init(state, module) 2022-11-23T02:58:20.3202180Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3202307Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3202645Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3202763Z return func(*args, **kwargs) 2022-11-23T02:58:20.3203140Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", 
line 711, in init_flat_param_attributes 2022-11-23T02:58:20.3203244Z p_assert( 2022-11-23T02:58:20.3203579Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3203704Z traceback.print_stack() 2022-11-23T02:58:20.3203825Z File "<string>", line 1, in <module> 2022-11-23T02:58:20.3204021Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main 2022-11-23T02:58:20.3204157Z exitcode = _main(fd, parent_sentinel) 2022-11-23T02:58:20.3204357Z File "/opt/conda/lib/python3.10/multiprocessing/spawn.py", line 129, in _main 2022-11-23T02:58:20.3204506Z return self._bootstrap(parent_sentinel) 2022-11-23T02:58:20.3204720Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap 2022-11-23T02:58:20.3204818Z self.run() 2022-11-23T02:58:20.3205016Z File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run 2022-11-23T02:58:20.3205151Z self._target(*self._args, **self._kwargs) 2022-11-23T02:58:20.3205497Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 790, in _run 2022-11-23T02:58:20.3205621Z self.run_test(test_name, pipe) 2022-11-23T02:58:20.3206029Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 656, in run_test 2022-11-23T02:58:20.3206154Z getattr(self, test_name)() 2022-11-23T02:58:20.3206521Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 534, in wrapper 2022-11-23T02:58:20.3206612Z fn() 2022-11-23T02:58:20.3206978Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 247, in instantiated_test 2022-11-23T02:58:20.3207083Z test(self, **param_kwargs) 2022-11-23T02:58:20.3207440Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 166, in wrapper 2022-11-23T02:58:20.3207608Z return func(*args, **kwargs) 
2022-11-23T02:58:20.3207848Z File "/var/lib/jenkins/workspace/test/distributed/fsdp/test_fsdp_core.py", line 171, in test_transformer 2022-11-23T02:58:20.3207956Z self.run_subtests( 2022-11-23T02:58:20.3208315Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 754, in run_subtests 2022-11-23T02:58:20.3208476Z test_fn(*test_args, **test_kwargs, **subtest_kwargs) 2022-11-23T02:58:20.3208839Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 1008, in _test_fsdp_parity 2022-11-23T02:58:20.3208976Z fsdp_loss = self._train_for_several_steps( 2022-11-23T02:58:20.3209351Z File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py", line 827, in _train_for_several_steps 2022-11-23T02:58:20.3209463Z output = model(*input) 2022-11-23T02:58:20.3209786Z File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1427, in _call_impl 2022-11-23T02:58:20.3209926Z return forward_call(*input, **kwargs) 2022-11-23T02:58:20.3210307Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 685, in forward 2022-11-23T02:58:20.3210480Z args, kwargs = _root_pre_forward(self, self, *args, **kwargs) 2022-11-23T02:58:20.3210846Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 334, in _root_pre_forward 2022-11-23T02:58:20.3210951Z _lazy_init(state, module) 2022-11-23T02:58:20.3211300Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_runtime_utils.py", line 62, in _lazy_init 2022-11-23T02:58:20.3211440Z handle.init_flat_param_attributes() 2022-11-23T02:58:20.3211777Z File "/opt/conda/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context 2022-11-23T02:58:20.3211898Z return func(*args, **kwargs) 2022-11-23T02:58:20.3212274Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/flat_param.py", line 711, in 
init_flat_param_attributes 2022-11-23T02:58:20.3212381Z p_assert( 2022-11-23T02:58:20.3212722Z File "/opt/conda/lib/python3.10/site-packages/torch/distributed/fsdp/_utils.py", line 116, in p_assert 2022-11-23T02:58:20.3212835Z traceback.print_stack() 2022-11-23T02:58:20.3213069Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3213302Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3213532Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3213763Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3213990Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3214213Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3215050Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.3215820Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.3216055Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3216272Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3216507Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.3216794Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3217022Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3217254Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3217483Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3217714Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3217939Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3218164Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3218378Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3218608Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3219368Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.3220121Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.3220348Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3220578Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.3220808Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3221041Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3221265Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3221491Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3221706Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3221933Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3222158Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3222384Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3222610Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3222833Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3223640Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.3224387Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.3224617Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 
2022-11-23T02:58:20.3224842Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3225178Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3225408Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3225639Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3225862Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3226089Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3226317Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3226543Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3226768Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3226980Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3227207Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3227963Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (function decref) 2022-11-23T02:58:20.3228711Z [W python_variable.cpp:318] Warning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. 
(function decref) 2022-11-23T02:58:20.3229107Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3229349Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3229585Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3229817Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3230051Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3230279Z INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration. 2022-11-23T02:58:20.3230375Z dist init r=0, world=2 2022-11-23T02:58:20.3230710Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3231029Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3231344Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3231657Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3232065Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3232395Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 
2022-11-23T02:58:20.3232705Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3233014Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3233374Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3233677Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3233980Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3234267Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:0 after the FSDP constructor. 2022-11-23T02:58:20.3234373Z dist init r=1, world=2 2022-11-23T02:58:20.3234688Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3235000Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3235303Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3235607Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 
2022-11-23T02:58:20.3235909Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3236210Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3236511Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3236826Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3237129Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3237420Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3237724Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3238023Z Expects the `FlatParameter` to be offloaded to CPU since CPU offloading is enabled. You may be accidentally moving the model to cuda:1 after the FSDP constructor. 2022-11-23T02:58:20.3238121Z ok (10.521s) 2022-11-23T02:58:20.3238146Z 2022-11-23T02:58:20.3238431Z ---------------------------------------------------------------------- 2022-11-23T02:58:20.3238546Z Ran 59 tests in 571.626s 2022-11-23T02:58:20.3238612Z 2022-11-23T02:58:20.3238724Z OK (skipped=5) 2022-11-23T02:58:20.3238743Z 2022-11-23T02:58:20.3238861Z Generating XML reports... 
2022-11-23T02:58:20.3239275Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20221123024847.xml 2022-11-23T02:58:20.3239680Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20221123024847.xml 2022-11-23T02:58:20.3240073Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20221123024847.xml 2022-11-23T02:58:20.3240508Z Generated XML report: test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20221123024847.xml 2022-11-23T02:58:20.3240574Z 2022-11-23T02:58:20.3241067Z ##[endgroup] 2022-11-23T02:58:20.3241528Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core (/var/lib/jenkins/workspace/test/test-reports/distributed-fsdp-test_fsdp_core_dav8tr5u) 2022-11-23T02:58:20.3241549Z 2022-11-23T02:58:20.3241618Z 2022-11-23T02:58:20.3241720Z real 96m57.277s 2022-11-23T02:58:20.3241822Z user 153m9.529s 2022-11-23T02:58:20.3241922Z sys 87m53.955s 2022-11-23T02:58:20.3242031Z + assert_git_not_dirty 2022-11-23T02:58:20.3242259Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *rocm* ]] 2022-11-23T02:58:20.3242489Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 != *xla* ]] 2022-11-23T02:58:20.3242644Z ++ git status --porcelain 2022-11-23T02:58:21.3674844Z + git_status= 2022-11-23T02:58:21.3675272Z + [[ -n '' ]] 2022-11-23T02:58:21.3675682Z + [[ linux-bionic-cuda11.6-py3.10-gcc7 == *cuda* ]] 2022-11-23T02:58:21.3675970Z + [[ 3 == 1 ]] 2022-11-23T02:58:21.3676199Z + [[ 3 == 1 ]] 2022-11-23T02:58:21.3745261Z Prepare all required actions 2022-11-23T02:58:21.3745684Z Getting action download info 2022-11-23T02:58:22.0317362Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2022-11-23T02:58:22.1733067Z ##[group]Run ./.github/actions/get-workflow-job-id 2022-11-23T02:58:22.1733361Z with: 2022-11-23T02:58:22.1733871Z github-token: 
*** 2022-11-23T02:58:22.1734138Z env: 2022-11-23T02:58:22.1734371Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:22.1734657Z GPU_FLAG: --gpus all 2022-11-23T02:58:22.1735028Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:22.1735392Z ##[endgroup] 2022-11-23T02:58:22.1770882Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2022-11-23T02:58:22.1771220Z with: 2022-11-23T02:58:22.1771458Z shell: bash 2022-11-23T02:58:22.1771691Z timeout_minutes: 10 2022-11-23T02:58:22.1771956Z max_attempts: 5 2022-11-23T02:58:22.1772279Z retry_wait_seconds: 30 2022-11-23T02:58:22.1772789Z command: set -eux python3 -m pip install requests==2.26.0 GHA_WORKFLOW_JOB_ID=$(python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}") echo "job-id=${GHA_WORKFLOW_JOB_ID}" >> "${GITHUB_OUTPUT}" 2022-11-23T02:58:22.1773322Z polling_interval_seconds: 1 2022-11-23T02:58:22.1773611Z warning_on_retry: true 2022-11-23T02:58:22.1773890Z continue_on_error: false 2022-11-23T02:58:22.1774125Z env: 2022-11-23T02:58:22.1774380Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:22.1774662Z GPU_FLAG: --gpus all 2022-11-23T02:58:22.1775015Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:22.1775537Z GITHUB_TOKEN: *** 2022-11-23T02:58:22.1775802Z ##[endgroup] 2022-11-23T02:58:22.2524162Z + python3 -m pip install requests==2.26.0 2022-11-23T02:58:22.5350798Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T02:58:22.6762493Z Collecting requests==2.26.0 2022-11-23T02:58:22.6999406Z Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB) 2022-11-23T02:58:22.7619704Z Collecting idna<4,>=2.5; python_version >= "3" 2022-11-23T02:58:22.7745474Z Downloading idna-3.4-py3-none-any.whl (61 kB) 2022-11-23T02:58:22.9541569Z Collecting charset-normalizer~=2.0.0; python_version >= "3" 2022-11-23T02:58:22.9585524Z Downloading 
charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2022-11-23T02:58:23.0623391Z Collecting urllib3<1.27,>=1.21.1 2022-11-23T02:58:23.0668071Z Downloading urllib3-1.26.12-py2.py3-none-any.whl (140 kB) 2022-11-23T02:58:23.1552960Z Collecting certifi>=2017.4.17 2022-11-23T02:58:23.1602309Z Downloading certifi-2022.9.24-py3-none-any.whl (161 kB) 2022-11-23T02:58:23.2500083Z Installing collected packages: idna, charset-normalizer, urllib3, certifi, requests 2022-11-23T02:58:23.3632530Z WARNING: The script normalizer is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T02:58:23.3633199Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T02:58:23.5059086Z Successfully installed certifi-2022.9.24 charset-normalizer-2.0.12 idna-3.4 requests-2.26.0 urllib3-1.26.12 2022-11-23T02:58:23.5543433Z ++ python3 .github/scripts/get_workflow_job_id.py 3528293562 i-08478b31fddc5d09b 2022-11-23T02:58:25.8286821Z + GHA_WORKFLOW_JOB_ID=9655200299 2022-11-23T02:58:25.8287331Z + echo job-id=9655200299 2022-11-23T02:58:26.2524209Z Command completed after 1 attempt(s). 
2022-11-23T02:58:26.2661333Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2022-11-23T02:58:26.2661702Z kill "$MONITOR_SCRIPT_PID" 2022-11-23T02:58:26.2675365Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:58:26.2675677Z env: 2022-11-23T02:58:26.2675923Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:26.2676176Z GPU_FLAG: --gpus all 2022-11-23T02:58:26.2676537Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:26.2676912Z MONITOR_SCRIPT_PID: 63621 2022-11-23T02:58:26.2677154Z ##[endgroup] 2022-11-23T02:58:26.2776274Z Prepare all required actions 2022-11-23T02:58:26.2776653Z Getting action download info 2022-11-23T02:58:27.0701088Z Download action repository 'actions/upload-artifact@v3' (SHA:83fd05a356d7e2593de66fc9913b3002723633cb) 2022-11-23T02:58:27.2315040Z ##[group]Run ./.github/actions/upload-test-artifacts 2022-11-23T02:58:27.2315344Z with: 2022-11-23T02:58:27.2315704Z file-suffix: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299 2022-11-23T02:58:27.2316038Z env: 2022-11-23T02:58:27.2316275Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:27.2316539Z GPU_FLAG: --gpus all 2022-11-23T02:58:27.2316878Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:27.2317223Z ##[endgroup] 2022-11-23T02:58:27.2346966Z ##[group]Run # Remove any previous test jsons if they exist 2022-11-23T02:58:27.2347349Z # Remove any previous test jsons if they exist 2022-11-23T02:58:27.2347674Z rm -f test-jsons-*.zip 2022-11-23T02:58:27.2348044Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test -i '*.json' 2022-11-23T02:58:27.2360445Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:58:27.2360743Z env: 2022-11-23T02:58:27.2360985Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:27.2361240Z GPU_FLAG: --gpus all 2022-11-23T02:58:27.2361601Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:27.2362069Z 
FILE_SUFFIX: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299 2022-11-23T02:58:27.2362411Z ##[endgroup] 2022-11-23T02:58:27.2521282Z adding: test/allowlist_for_publicAPI.json (deflated 79%) 2022-11-23T02:58:27.2558367Z adding: test/benchmark_utils/callgrind_artifacts.json (deflated 92%) 2022-11-23T02:58:27.2565868Z adding: test/profiler/profiler_utils_mock_events.json (deflated 87%) 2022-11-23T02:58:27.2566993Z adding: test/.pytorch-slow-tests.json (deflated 73%) 2022-11-23T02:58:27.2578522Z adding: test/.pytorch-disabled-tests.json (deflated 86%) 2022-11-23T02:58:27.2602504Z ##[group]Run # Remove any previous test reports if they exist 2022-11-23T02:58:27.2602906Z # Remove any previous test reports if they exist 2022-11-23T02:58:27.2603241Z rm -f test-reports-*.zip 2022-11-23T02:58:27.2603601Z zip -r "test-reports-${FILE_SUFFIX}.zip" test -i '*.xml' -i '*.csv' 2022-11-23T02:58:27.2615647Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:58:27.2615954Z env: 2022-11-23T02:58:27.2616200Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:27.2616455Z GPU_FLAG: --gpus all 2022-11-23T02:58:27.2616818Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:27.2617290Z FILE_SUFFIX: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299 2022-11-23T02:58:27.2617653Z ##[endgroup] 2022-11-23T02:58:27.2768590Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012131.xml (deflated 41%) 2022-11-23T02:58:27.2769689Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012141.xml (deflated 42%) 2022-11-23T02:58:27.2770497Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012144.xml (deflated 41%) 2022-11-23T02:58:27.2771304Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012151.xml (deflated 43%) 2022-11-23T02:58:27.2772123Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012155.xml (deflated 41%) 2022-11-23T02:58:27.2772895Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012204.xml (deflated 41%) 2022-11-23T02:58:27.2773784Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012212.xml (deflated 41%) 2022-11-23T02:58:27.2774696Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012222.xml (deflated 40%) 2022-11-23T02:58:27.2775510Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012231.xml (deflated 40%) 2022-11-23T02:58:27.2776297Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012240.xml (deflated 39%) 2022-11-23T02:58:27.2777086Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012249.xml (deflated 39%) 2022-11-23T02:58:27.2777881Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012258.xml (deflated 40%) 2022-11-23T02:58:27.2778673Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012308.xml (deflated 40%) 2022-11-23T02:58:27.2779457Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012316.xml (deflated 42%) 2022-11-23T02:58:27.2780247Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012321.xml (deflated 42%) 2022-11-23T02:58:27.2781042Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012327.xml (deflated 42%) 2022-11-23T02:58:27.2781843Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012332.xml (deflated 42%) 2022-11-23T02:58:27.2782631Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012340.xml (deflated 42%) 2022-11-23T02:58:27.2783402Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012343.xml (deflated 45%) 2022-11-23T02:58:27.2784203Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012345.xml (deflated 47%) 2022-11-23T02:58:27.2784997Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012347.xml (deflated 48%) 2022-11-23T02:58:27.2785785Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012350.xml (deflated 45%) 2022-11-23T02:58:27.2786557Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012352.xml (deflated 40%) 2022-11-23T02:58:27.2787340Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012401.xml (deflated 43%) 2022-11-23T02:58:27.2788123Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012404.xml (deflated 44%) 2022-11-23T02:58:27.2788913Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012406.xml (deflated 43%) 2022-11-23T02:58:27.2790398Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012409.xml (deflated 44%) 2022-11-23T02:58:27.2791184Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012411.xml (deflated 43%) 2022-11-23T02:58:27.2792207Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012413.xml (deflated 40%) 2022-11-23T02:58:27.2792998Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012423.xml (deflated 41%) 2022-11-23T02:58:27.2793767Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012425.xml (deflated 41%) 2022-11-23T02:58:27.2794554Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012427.xml (deflated 41%) 2022-11-23T02:58:27.2795454Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012436.xml (deflated 42%) 2022-11-23T02:58:27.2796259Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012443.xml (deflated 42%) 2022-11-23T02:58:27.2797031Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012446.xml (deflated 42%) 2022-11-23T02:58:27.2797819Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012448.xml (deflated 42%) 2022-11-23T02:58:27.2798603Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012450.xml (deflated 42%) 2022-11-23T02:58:27.2799392Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012453.xml (deflated 40%) 2022-11-23T02:58:27.2800170Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012502.xml (deflated 40%) 2022-11-23T02:58:27.2800961Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012513.xml (deflated 43%) 2022-11-23T02:58:27.2801743Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012515.xml (deflated 41%) 2022-11-23T02:58:27.2802523Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012517.xml (deflated 41%) 2022-11-23T02:58:27.2803324Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012520.xml (deflated 41%) 2022-11-23T02:58:27.2804096Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012522.xml (deflated 41%) 2022-11-23T02:58:27.2804892Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012524.xml (deflated 41%) 2022-11-23T02:58:27.2805665Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012527.xml (deflated 41%) 2022-11-23T02:58:27.2806447Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012529.xml (deflated 41%) 2022-11-23T02:58:27.2807213Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012532.xml (deflated 41%) 2022-11-23T02:58:27.2807998Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012534.xml (deflated 41%) 2022-11-23T02:58:27.2808775Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012536.xml (deflated 40%) 2022-11-23T02:58:27.2809670Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012543.xml (deflated 41%) 2022-11-23T02:58:27.2810430Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012546.xml (deflated 41%) 2022-11-23T02:58:27.2811207Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012548.xml (deflated 41%) 2022-11-23T02:58:27.2811992Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012550.xml (deflated 41%) 2022-11-23T02:58:27.2812776Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012557.xml (deflated 40%) 2022-11-23T02:58:27.2813540Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012606.xml (deflated 41%) 2022-11-23T02:58:27.2814382Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012616.xml (deflated 41%) 2022-11-23T02:58:27.2815176Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012625.xml (deflated 41%) 2022-11-23T02:58:27.2815959Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012634.xml (deflated 42%) 2022-11-23T02:58:27.2816723Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012641.xml (deflated 42%) 2022-11-23T02:58:27.2817502Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012648.xml (deflated 42%) 2022-11-23T02:58:27.2818277Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012654.xml (deflated 42%) 2022-11-23T02:58:27.2819056Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012701.xml (deflated 41%) 2022-11-23T02:58:27.2819830Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012710.xml (deflated 41%) 2022-11-23T02:58:27.2820616Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012720.xml (deflated 41%) 2022-11-23T02:58:27.2821397Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012729.xml (deflated 41%) 2022-11-23T02:58:27.2822173Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012739.xml (deflated 40%) 2022-11-23T02:58:27.2822940Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012748.xml (deflated 40%) 2022-11-23T02:58:27.2823760Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012756.xml (deflated 40%) 2022-11-23T02:58:27.2824555Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012806.xml (deflated 41%) 2022-11-23T02:58:27.2825336Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012815.xml (deflated 41%) 2022-11-23T02:58:27.2826110Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012824.xml (deflated 41%) 2022-11-23T02:58:27.2826881Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012826.xml (deflated 41%) 2022-11-23T02:58:27.2827661Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012829.xml (deflated 40%) 2022-11-23T02:58:27.2828441Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012831.xml (deflated 42%) 2022-11-23T02:58:27.2829935Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012834.xml (deflated 42%) 2022-11-23T02:58:27.2830729Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012836.xml (deflated 42%) 2022-11-23T02:58:27.2831510Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012838.xml (deflated 41%) 2022-11-23T02:58:27.2832291Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012841.xml (deflated 42%) 2022-11-23T02:58:27.2833071Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012843.xml (deflated 42%) 2022-11-23T02:58:27.2833837Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012846.xml (deflated 42%) 2022-11-23T02:58:27.2834721Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012848.xml (deflated 42%) 2022-11-23T02:58:27.2835527Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012850.xml (deflated 42%) 2022-11-23T02:58:27.2836308Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012853.xml (deflated 42%) 2022-11-23T02:58:27.2837072Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012855.xml (deflated 42%) 2022-11-23T02:58:27.2837854Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012857.xml (deflated 42%) 2022-11-23T02:58:27.2838633Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012900.xml (deflated 42%) 2022-11-23T02:58:27.2839422Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012902.xml (deflated 42%) 2022-11-23T02:58:27.2840190Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012905.xml (deflated 42%) 2022-11-23T02:58:27.2840968Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012907.xml (deflated 43%) 2022-11-23T02:58:27.2841744Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012909.xml (deflated 42%) 2022-11-23T02:58:27.2842522Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012912.xml (deflated 42%) 2022-11-23T02:58:27.2843288Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012914.xml (deflated 42%) 2022-11-23T02:58:27.2844075Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012917.xml (deflated 42%) 2022-11-23T02:58:27.2844865Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012919.xml (deflated 42%) 2022-11-23T02:58:27.2845647Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012921.xml (deflated 42%) 2022-11-23T02:58:27.2846415Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012924.xml (deflated 42%) 2022-11-23T02:58:27.2847197Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012926.xml (deflated 42%) 2022-11-23T02:58:27.2847975Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012929.xml (deflated 41%) 2022-11-23T02:58:27.2848751Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012939.xml (deflated 42%) 2022-11-23T02:58:27.2849632Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012945.xml (deflated 42%) 2022-11-23T02:58:27.2850450Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012948.xml (deflated 42%) 2022-11-23T02:58:27.2851229Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123012950.xml (deflated 41%) 2022-11-23T02:58:27.2852010Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013000.xml (deflated 42%) 2022-11-23T02:58:27.2852792Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013002.xml (deflated 42%) 2022-11-23T02:58:27.2853562Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013009.xml (deflated 42%) 2022-11-23T02:58:27.2854409Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013011.xml (deflated 42%) 2022-11-23T02:58:27.2855202Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013018.xml (deflated 42%) 2022-11-23T02:58:27.2856021Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013020.xml (deflated 43%) 2022-11-23T02:58:27.2856791Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013023.xml (deflated 43%) 2022-11-23T02:58:27.2857571Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013025.xml (deflated 41%) 2022-11-23T02:58:27.2858351Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013027.xml (deflated 41%) 2022-11-23T02:58:27.2859149Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013030.xml (deflated 42%) 2022-11-23T02:58:27.2859913Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013032.xml (deflated 41%) 2022-11-23T02:58:27.2860695Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013035.xml (deflated 41%) 2022-11-23T02:58:27.2861475Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013037.xml (deflated 41%) 2022-11-23T02:58:27.2862249Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013039.xml (deflated 41%) 2022-11-23T02:58:27.2863024Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013042.xml (deflated 41%) 2022-11-23T02:58:27.2863797Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013044.xml (deflated 41%) 2022-11-23T02:58:27.2864588Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013046.xml (deflated 41%) 2022-11-23T02:58:27.2865372Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013049.xml (deflated 40%) 2022-11-23T02:58:27.2866151Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013058.xml (deflated 41%) 2022-11-23T02:58:27.2866918Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013100.xml (deflated 40%) 2022-11-23T02:58:27.2867699Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013110.xml (deflated 42%) 2022-11-23T02:58:27.2868474Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013116.xml (deflated 40%) 2022-11-23T02:58:27.2869922Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013125.xml (deflated 42%) 2022-11-23T02:58:27.2870699Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013129.xml (deflated 42%) 2022-11-23T02:58:27.2871485Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013134.xml (deflated 42%) 2022-11-23T02:58:27.2872270Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013138.xml (deflated 40%) 2022-11-23T02:58:27.2873049Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013147.xml (deflated 40%) 2022-11-23T02:58:27.2873818Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013156.xml (deflated 42%) 2022-11-23T02:58:27.2874721Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013200.xml (deflated 42%) 2022-11-23T02:58:27.2875516Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013204.xml (deflated 41%) 2022-11-23T02:58:27.2876293Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013213.xml (deflated 40%) 2022-11-23T02:58:27.2877059Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013222.xml (deflated 40%) 2022-11-23T02:58:27.2877833Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013231.xml (deflated 40%) 2022-11-23T02:58:27.2878605Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013240.xml (deflated 42%) 2022-11-23T02:58:27.2879399Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013244.xml (deflated 40%) 2022-11-23T02:58:27.2880169Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013253.xml (deflated 42%) 2022-11-23T02:58:27.2880947Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013257.xml (deflated 41%) 2022-11-23T02:58:27.2881727Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013307.xml (deflated 42%) 2022-11-23T02:58:27.2882509Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013311.xml (deflated 42%) 2022-11-23T02:58:27.2883279Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013315.xml (deflated 40%) 2022-11-23T02:58:27.2884071Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013324.xml (deflated 40%) 2022-11-23T02:58:27.2884850Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013334.xml (deflated 42%) 2022-11-23T02:58:27.2885626Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013338.xml (deflated 40%) 2022-11-23T02:58:27.2886392Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013347.xml (deflated 42%) 2022-11-23T02:58:27.2887172Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013350.xml (deflated 42%) 2022-11-23T02:58:27.2887954Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013352.xml (deflated 42%) 2022-11-23T02:58:27.2888736Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013355.xml (deflated 41%) 2022-11-23T02:58:27.2889608Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013357.xml (deflated 41%) 2022-11-23T02:58:27.2890388Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013359.xml (deflated 41%) 2022-11-23T02:58:27.2891164Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013402.xml (deflated 41%) 2022-11-23T02:58:27.2891942Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013404.xml (deflated 41%) 2022-11-23T02:58:27.2892705Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013407.xml (deflated 41%) 2022-11-23T02:58:27.2893475Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013409.xml (deflated 42%) 2022-11-23T02:58:27.2894315Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013412.xml (deflated 42%) 2022-11-23T02:58:27.2895110Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013414.xml (deflated 42%) 2022-11-23T02:58:27.2895868Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013417.xml (deflated 42%) 2022-11-23T02:58:27.2896656Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013421.xml (deflated 40%) 2022-11-23T02:58:27.2897439Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013430.xml (deflated 40%) 2022-11-23T02:58:27.2898215Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013439.xml (deflated 40%) 2022-11-23T02:58:27.2898985Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013448.xml (deflated 40%) 2022-11-23T02:58:27.2899769Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013458.xml (deflated 40%) 2022-11-23T02:58:27.2900545Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013506.xml (deflated 40%) 2022-11-23T02:58:27.2901323Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013525.xml (deflated 41%) 2022-11-23T02:58:27.2902091Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013535.xml (deflated 41%) 2022-11-23T02:58:27.2902950Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013545.xml (deflated 41%) 2022-11-23T02:58:27.2903719Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013554.xml (deflated 40%) 2022-11-23T02:58:27.2904504Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013603.xml (deflated 42%) 2022-11-23T02:58:27.2905283Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013607.xml (deflated 42%) 2022-11-23T02:58:27.2906058Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013611.xml (deflated 41%) 2022-11-23T02:58:27.2906823Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013621.xml (deflated 40%) 2022-11-23T02:58:27.2907603Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013630.xml (deflated 42%) 2022-11-23T02:58:27.2908382Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013634.xml (deflated 41%) 2022-11-23T02:58:27.2909857Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013643.xml (deflated 42%) 2022-11-23T02:58:27.2910655Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013648.xml (deflated 41%) 2022-11-23T02:58:27.2911436Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013658.xml (deflated 41%) 2022-11-23T02:58:27.2912221Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013706.xml (deflated 41%) 2022-11-23T02:58:27.2913001Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013715.xml (deflated 42%) 2022-11-23T02:58:27.2913766Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013719.xml (deflated 42%) 2022-11-23T02:58:27.2914647Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013723.xml (deflated 42%) 2022-11-23T02:58:27.2915446Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013727.xml (deflated 40%) 2022-11-23T02:58:27.2916223Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013737.xml (deflated 41%) 2022-11-23T02:58:27.2916990Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013745.xml (deflated 40%) 2022-11-23T02:58:27.2917777Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013752.xml (deflated 40%) 2022-11-23T02:58:27.2918554Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013759.xml (deflated 42%) 2022-11-23T02:58:27.2919345Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013804.xml (deflated 42%) 2022-11-23T02:58:27.2920112Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013808.xml (deflated 40%) 2022-11-23T02:58:27.2920891Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013814.xml (deflated 41%) 2022-11-23T02:58:27.2921674Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013817.xml (deflated 41%) 2022-11-23T02:58:27.2922454Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013819.xml (deflated 42%) 2022-11-23T02:58:27.2923213Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013822.xml (deflated 41%) 2022-11-23T02:58:27.2924004Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013824.xml (deflated 41%) 2022-11-23T02:58:27.2924785Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013826.xml (deflated 41%) 2022-11-23T02:58:27.2925561Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013829.xml (deflated 40%) 2022-11-23T02:58:27.2926326Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013831.xml (deflated 41%) 2022-11-23T02:58:27.2927111Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013838.xml (deflated 42%) 2022-11-23T02:58:27.2927892Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013841.xml (deflated 40%) 2022-11-23T02:58:27.2928767Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013847.xml (deflated 40%) 2022-11-23T02:58:27.2929532Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013854.xml (deflated 40%) 2022-11-23T02:58:27.2930314Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013901.xml (deflated 40%) 2022-11-23T02:58:27.2931089Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013910.xml (deflated 41%) 2022-11-23T02:58:27.2931873Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013919.xml (deflated 41%) 2022-11-23T02:58:27.2932640Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013928.xml (deflated 40%) 2022-11-23T02:58:27.2933474Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013937.xml (deflated 40%) 2022-11-23T02:58:27.2934274Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123013947.xml (deflated 40%) 2022-11-23T02:58:27.2935052Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014012.xml (deflated 41%) 2022-11-23T02:58:27.2935815Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014036.xml (deflated 42%) 2022-11-23T02:58:27.2936597Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014039.xml (deflated 41%) 2022-11-23T02:58:27.2937380Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014041.xml (deflated 41%) 2022-11-23T02:58:27.2938154Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014044.xml (deflated 41%) 2022-11-23T02:58:27.2938921Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014046.xml (deflated 41%) 2022-11-23T02:58:27.2939709Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014048.xml (deflated 42%) 2022-11-23T02:58:27.2940492Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014051.xml (deflated 42%) 2022-11-23T02:58:27.2941271Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014053.xml (deflated 42%) 2022-11-23T02:58:27.2942051Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014056.xml (deflated 41%) 2022-11-23T02:58:27.2942819Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014058.xml (deflated 42%) 2022-11-23T02:58:27.2943611Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014100.xml (deflated 42%) 2022-11-23T02:58:27.2944392Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014103.xml (deflated 42%) 2022-11-23T02:58:27.2945171Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014105.xml (deflated 41%) 2022-11-23T02:58:27.2945938Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014108.xml (deflated 40%) 2022-11-23T02:58:27.2946718Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014114.xml (deflated 40%) 2022-11-23T02:58:27.2947499Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014121.xml (deflated 42%) 2022-11-23T02:58:27.2948351Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014124.xml (deflated 42%) 2022-11-23T02:58:27.2949736Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014126.xml (deflated 42%) 2022-11-23T02:58:27.2950591Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014130.xml (deflated 40%) 2022-11-23T02:58:27.2951379Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014139.xml (deflated 41%) 2022-11-23T02:58:27.2952161Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014149.xml (deflated 40%) 2022-11-23T02:58:27.2952928Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014159.xml (deflated 42%) 2022-11-23T02:58:27.2953809Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014203.xml (deflated 41%) 2022-11-23T02:58:27.2954606Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014207.xml (deflated 40%) 2022-11-23T02:58:27.2955388Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014214.xml (deflated 40%) 2022-11-23T02:58:27.2956154Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014221.xml (deflated 42%) 2022-11-23T02:58:27.2956935Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014225.xml (deflated 40%) 2022-11-23T02:58:27.2957713Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014234.xml (deflated 40%) 2022-11-23T02:58:27.2958501Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014243.xml (deflated 40%) 2022-11-23T02:58:27.2959274Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014253.xml (deflated 40%) 2022-11-23T02:58:27.2960055Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014302.xml (deflated 42%) 2022-11-23T02:58:27.2960843Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014308.xml (deflated 42%) 2022-11-23T02:58:27.2961625Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014315.xml (deflated 42%) 2022-11-23T02:58:27.2962391Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014322.xml (deflated 42%) 2022-11-23T02:58:27.2963175Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014329.xml (deflated 41%) 2022-11-23T02:58:27.2963967Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014338.xml (deflated 41%) 2022-11-23T02:58:27.2964748Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014348.xml (deflated 42%) 2022-11-23T02:58:27.2965515Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014350.xml (deflated 41%) 2022-11-23T02:58:27.2966297Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014359.xml (deflated 43%) 2022-11-23T02:58:27.2967071Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014402.xml (deflated 43%) 2022-11-23T02:58:27.2967854Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014404.xml (deflated 40%) 2022-11-23T02:58:27.2968716Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014413.xml (deflated 42%) 2022-11-23T02:58:27.2969503Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014416.xml (deflated 42%) 2022-11-23T02:58:27.2970285Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014418.xml (deflated 41%) 2022-11-23T02:58:27.2971068Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014427.xml (deflated 41%) 2022-11-23T02:58:27.2971833Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014430.xml (deflated 41%) 2022-11-23T02:58:27.2972616Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014432.xml (deflated 41%) 2022-11-23T02:58:27.2973457Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014434.xml (deflated 41%) 2022-11-23T02:58:27.2974250Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014437.xml (deflated 41%) 2022-11-23T02:58:27.2975107Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014439.xml (deflated 41%) 2022-11-23T02:58:27.2975894Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014442.xml (deflated 41%) 2022-11-23T02:58:27.2976675Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014444.xml (deflated 41%) 2022-11-23T02:58:27.2977440Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014447.xml (deflated 41%) 2022-11-23T02:58:27.2978231Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014455.xml (deflated 42%) 2022-11-23T02:58:27.2979009Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014458.xml (deflated 42%) 2022-11-23T02:58:27.2979789Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014500.xml (deflated 42%) 2022-11-23T02:58:27.2980555Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014502.xml (deflated 41%) 2022-11-23T02:58:27.2981328Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014512.xml (deflated 41%) 2022-11-23T02:58:27.2982105Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014514.xml (deflated 41%) 2022-11-23T02:58:27.2982884Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014516.xml (deflated 41%) 2022-11-23T02:58:27.2983661Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014519.xml (deflated 41%) 2022-11-23T02:58:27.2984437Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014528.xml (deflated 41%) 2022-11-23T02:58:27.2985214Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014537.xml (deflated 40%) 2022-11-23T02:58:27.2985993Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014546.xml (deflated 40%) 2022-11-23T02:58:27.2986767Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014555.xml (deflated 42%) 2022-11-23T02:58:27.2987543Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014558.xml (deflated 42%) 2022-11-23T02:58:27.2988394Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014600.xml (deflated 41%) 2022-11-23T02:58:27.2989752Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014609.xml (deflated 41%) 2022-11-23T02:58:27.2990550Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014618.xml (deflated 41%) 2022-11-23T02:58:27.2991333Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014621.xml (deflated 41%) 2022-11-23T02:58:27.2992112Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014630.xml (deflated 40%) 2022-11-23T02:58:27.2992893Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014643.xml (deflated 40%) 2022-11-23T02:58:27.2993755Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014700.xml (deflated 41%) 2022-11-23T02:58:27.2994558Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014710.xml (deflated 41%) 2022-11-23T02:58:27.2995338Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014712.xml (deflated 41%) 2022-11-23T02:58:27.2996118Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014719.xml (deflated 42%) 2022-11-23T02:58:27.2996879Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014723.xml (deflated 41%) 2022-11-23T02:58:27.2997665Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014732.xml (deflated 41%) 2022-11-23T02:58:27.2998450Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014741.xml (deflated 40%) 2022-11-23T02:58:27.2999235Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014750.xml (deflated 40%) 2022-11-23T02:58:27.3000005Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014759.xml (deflated 40%) 2022-11-23T02:58:27.3000793Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014808.xml (deflated 39%) 2022-11-23T02:58:27.3001572Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014817.xml (deflated 40%) 2022-11-23T02:58:27.3002359Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014826.xml (deflated 40%) 2022-11-23T02:58:27.3003131Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014836.xml (deflated 40%) 2022-11-23T02:58:27.3003912Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014844.xml (deflated 42%) 2022-11-23T02:58:27.3004694Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014848.xml (deflated 41%) 2022-11-23T02:58:27.3005474Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014855.xml (deflated 42%) 2022-11-23T02:58:27.3006241Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014900.xml (deflated 42%) 2022-11-23T02:58:27.3007022Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014908.xml (deflated 41%) 2022-11-23T02:58:27.3007807Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014911.xml (deflated 45%) 2022-11-23T02:58:27.3008675Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014913.xml (deflated 46%) 2022-11-23T02:58:27.3009440Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014915.xml (deflated 48%) 2022-11-23T02:58:27.3010226Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014918.xml (deflated 45%) 2022-11-23T02:58:27.3011008Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014920.xml (deflated 40%) 2022-11-23T02:58:27.3011783Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014930.xml (deflated 43%) 2022-11-23T02:58:27.3012546Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014932.xml (deflated 43%) 2022-11-23T02:58:27.3013408Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014934.xml (deflated 43%) 2022-11-23T02:58:27.3014205Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014937.xml (deflated 43%) 2022-11-23T02:58:27.3014984Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014939.xml (deflated 43%) 2022-11-23T02:58:27.3015762Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014942.xml (deflated 40%) 2022-11-23T02:58:27.3016526Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014951.xml (deflated 41%) 2022-11-23T02:58:27.3017316Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014953.xml (deflated 41%) 2022-11-23T02:58:27.3018102Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123014956.xml (deflated 40%) 2022-11-23T02:58:27.3018881Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015005.xml (deflated 42%) 2022-11-23T02:58:27.3019643Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015012.xml (deflated 43%) 2022-11-23T02:58:27.3020422Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015014.xml (deflated 42%) 2022-11-23T02:58:27.3021200Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015016.xml (deflated 42%) 2022-11-23T02:58:27.3021980Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015019.xml (deflated 42%) 2022-11-23T02:58:27.3022760Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015021.xml (deflated 41%) 2022-11-23T02:58:27.3023539Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015031.xml (deflated 40%) 2022-11-23T02:58:27.3024322Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015041.xml (deflated 43%) 2022-11-23T02:58:27.3025107Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015044.xml (deflated 41%) 2022-11-23T02:58:27.3025870Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015046.xml (deflated 41%) 2022-11-23T02:58:27.3026646Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015048.xml (deflated 41%) 2022-11-23T02:58:27.3027507Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015051.xml (deflated 41%) 2022-11-23T02:58:27.3028293Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015053.xml (deflated 41%) 2022-11-23T02:58:27.3029597Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015056.xml (deflated 41%) 2022-11-23T02:58:27.3030417Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015058.xml (deflated 41%) 2022-11-23T02:58:27.3031201Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015100.xml (deflated 41%) 2022-11-23T02:58:27.3031980Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015103.xml (deflated 41%) 2022-11-23T02:58:27.3032836Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015105.xml (deflated 41%) 2022-11-23T02:58:27.3033645Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015112.xml (deflated 41%) 2022-11-23T02:58:27.3034422Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015114.xml (deflated 41%) 2022-11-23T02:58:27.3035196Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015117.xml (deflated 41%) 2022-11-23T02:58:27.3035961Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015119.xml (deflated 40%) 2022-11-23T02:58:27.3036742Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015126.xml (deflated 40%) 2022-11-23T02:58:27.3037529Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015135.xml (deflated 40%) 2022-11-23T02:58:27.3038321Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015144.xml (deflated 41%) 2022-11-23T02:58:27.3039087Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015154.xml (deflated 40%) 2022-11-23T02:58:27.3039872Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015203.xml (deflated 42%) 2022-11-23T02:58:27.3040652Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015210.xml (deflated 42%) 2022-11-23T02:58:27.3041429Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015216.xml (deflated 42%) 2022-11-23T02:58:27.3042190Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015223.xml (deflated 42%) 2022-11-23T02:58:27.3042985Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015230.xml (deflated 40%) 2022-11-23T02:58:27.3043765Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015239.xml (deflated 40%) 2022-11-23T02:58:27.3044549Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015249.xml (deflated 40%) 2022-11-23T02:58:27.3045313Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015258.xml (deflated 41%) 2022-11-23T02:58:27.3046091Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015307.xml (deflated 41%) 2022-11-23T02:58:27.3046870Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015317.xml (deflated 41%) 2022-11-23T02:58:27.3047748Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015325.xml (deflated 40%) 2022-11-23T02:58:27.3048508Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015334.xml (deflated 40%) 2022-11-23T02:58:27.3049284Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015344.xml (deflated 40%) 2022-11-23T02:58:27.3050111Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015353.xml (deflated 41%) 2022-11-23T02:58:27.3050893Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015355.xml (deflated 41%) 2022-11-23T02:58:27.3051652Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015357.xml (deflated 40%) 2022-11-23T02:58:27.3052499Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015400.xml (deflated 42%) 2022-11-23T02:58:27.3053295Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015402.xml (deflated 42%) 2022-11-23T02:58:27.3054076Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015405.xml (deflated 42%) 2022-11-23T02:58:27.3054839Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015407.xml (deflated 41%) 2022-11-23T02:58:27.3055619Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015410.xml (deflated 42%) 2022-11-23T02:58:27.3056398Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015412.xml (deflated 42%) 2022-11-23T02:58:27.3057183Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015414.xml (deflated 42%) 2022-11-23T02:58:27.3057965Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015417.xml (deflated 42%) 2022-11-23T02:58:27.3058725Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015419.xml (deflated 42%) 2022-11-23T02:58:27.3059507Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015422.xml (deflated 42%) 2022-11-23T02:58:27.3060285Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015424.xml (deflated 42%) 2022-11-23T02:58:27.3061059Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015426.xml (deflated 42%) 2022-11-23T02:58:27.3061832Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015429.xml (deflated 42%) 2022-11-23T02:58:27.3062627Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015431.xml (deflated 42%) 2022-11-23T02:58:27.3063407Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015434.xml (deflated 42%) 2022-11-23T02:58:27.3064186Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015436.xml (deflated 42%) 2022-11-23T02:58:27.3064949Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015438.xml (deflated 42%) 2022-11-23T02:58:27.3065729Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015441.xml (deflated 42%) 2022-11-23T02:58:27.3066507Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015443.xml (deflated 42%) 2022-11-23T02:58:27.3067355Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015446.xml (deflated 42%) 2022-11-23T02:58:27.3068176Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015448.xml (deflated 42%) 2022-11-23T02:58:27.3069402Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015450.xml (deflated 42%) 2022-11-23T02:58:27.3070320Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015453.xml (deflated 42%) 2022-11-23T02:58:27.3071100Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015455.xml (deflated 42%) 2022-11-23T02:58:27.3071864Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015458.xml (deflated 41%) 2022-11-23T02:58:27.3072739Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015507.xml (deflated 42%) 2022-11-23T02:58:27.3073533Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015514.xml (deflated 42%) 2022-11-23T02:58:27.3074320Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015516.xml (deflated 42%) 2022-11-23T02:58:27.3075087Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015519.xml (deflated 40%) 2022-11-23T02:58:27.3075863Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015528.xml (deflated 42%) 2022-11-23T02:58:27.3076641Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015531.xml (deflated 42%) 2022-11-23T02:58:27.3077424Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015537.xml (deflated 42%) 2022-11-23T02:58:27.3078199Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015540.xml (deflated 42%) 2022-11-23T02:58:27.3078980Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015547.xml (deflated 42%) 2022-11-23T02:58:27.3079764Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015549.xml (deflated 42%) 2022-11-23T02:58:27.3080541Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015551.xml (deflated 43%) 2022-11-23T02:58:27.3081309Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015554.xml (deflated 40%) 2022-11-23T02:58:27.3082088Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015556.xml (deflated 40%) 2022-11-23T02:58:27.3082873Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015559.xml (deflated 41%) 2022-11-23T02:58:27.3083654Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015601.xml (deflated 41%) 2022-11-23T02:58:27.3084419Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015603.xml (deflated 40%) 2022-11-23T02:58:27.3085207Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015606.xml (deflated 41%) 2022-11-23T02:58:27.3085989Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015608.xml (deflated 41%) 2022-11-23T02:58:27.3087127Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015611.xml (deflated 41%) 2022-11-23T02:58:27.3088164Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015613.xml (deflated 41%) 2022-11-23T02:58:27.3089084Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015615.xml (deflated 41%) 2022-11-23T02:58:27.3090015Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015618.xml (deflated 40%) 2022-11-23T02:58:27.3090918Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015627.xml (deflated 41%) 2022-11-23T02:58:27.3091833Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015629.xml (deflated 40%) 2022-11-23T02:58:27.3092683Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015639.xml (deflated 42%) 2022-11-23T02:58:27.3093760Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015645.xml (deflated 40%) 2022-11-23T02:58:27.3094684Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015654.xml (deflated 42%) 2022-11-23T02:58:27.3095472Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015658.xml (deflated 42%) 2022-11-23T02:58:27.3096255Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015702.xml (deflated 42%) 2022-11-23T02:58:27.3097111Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015707.xml (deflated 41%) 2022-11-23T02:58:27.3098087Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015716.xml (deflated 40%) 2022-11-23T02:58:27.3099014Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015725.xml (deflated 42%) 2022-11-23T02:58:27.3099871Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015729.xml (deflated 42%) 2022-11-23T02:58:27.3100771Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015733.xml (deflated 40%) 2022-11-23T02:58:27.3101759Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015742.xml (deflated 40%) 2022-11-23T02:58:27.3102618Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015751.xml (deflated 40%) 2022-11-23T02:58:27.3103519Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015800.xml (deflated 40%) 2022-11-23T02:58:27.3104457Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015809.xml (deflated 42%) 2022-11-23T02:58:27.3105284Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015814.xml (deflated 40%) 2022-11-23T02:58:27.3106065Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015822.xml (deflated 42%) 2022-11-23T02:58:27.3106844Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015826.xml (deflated 40%) 2022-11-23T02:58:27.3107627Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015836.xml (deflated 42%) 2022-11-23T02:58:27.3108393Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015840.xml (deflated 42%) 2022-11-23T02:58:27.3109748Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015844.xml (deflated 40%) 2022-11-23T02:58:27.3110675Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015853.xml (deflated 40%) 2022-11-23T02:58:27.3111459Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015903.xml (deflated 42%) 2022-11-23T02:58:27.3112227Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015907.xml (deflated 40%) 2022-11-23T02:58:27.3113007Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015917.xml (deflated 42%) 2022-11-23T02:58:27.3113784Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015919.xml (deflated 42%) 2022-11-23T02:58:27.3114561Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015921.xml (deflated 42%) 2022-11-23T02:58:27.3115408Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015924.xml (deflated 41%) 2022-11-23T02:58:27.3116205Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015926.xml (deflated 41%) 2022-11-23T02:58:27.3116988Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015929.xml (deflated 41%) 2022-11-23T02:58:27.3117767Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015931.xml (deflated 41%) 2022-11-23T02:58:27.3118532Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015933.xml (deflated 41%) 2022-11-23T02:58:27.3119314Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015936.xml (deflated 41%) 2022-11-23T02:58:27.3120115Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015938.xml (deflated 41%) 2022-11-23T02:58:27.3120895Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015941.xml (deflated 41%) 2022-11-23T02:58:27.3121661Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015943.xml (deflated 42%) 2022-11-23T02:58:27.3122438Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015945.xml (deflated 42%) 2022-11-23T02:58:27.3123215Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015950.xml (deflated 41%) 2022-11-23T02:58:27.3123999Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123015959.xml (deflated 40%) 2022-11-23T02:58:27.3124773Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020008.xml (deflated 40%) 2022-11-23T02:58:27.3125558Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020017.xml (deflated 40%) 2022-11-23T02:58:27.3126335Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020026.xml (deflated 40%) 2022-11-23T02:58:27.3127120Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020035.xml (deflated 40%) 2022-11-23T02:58:27.3127881Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020054.xml (deflated 41%) 2022-11-23T02:58:27.3128664Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020104.xml (deflated 41%) 2022-11-23T02:58:27.3129447Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020113.xml (deflated 41%) 2022-11-23T02:58:27.3130294Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020122.xml (deflated 40%) 2022-11-23T02:58:27.3131052Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020132.xml (deflated 42%) 2022-11-23T02:58:27.3131836Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020136.xml (deflated 43%) 2022-11-23T02:58:27.3132613Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020140.xml (deflated 42%) 2022-11-23T02:58:27.3133386Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020150.xml (deflated 41%) 2022-11-23T02:58:27.3134151Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020159.xml (deflated 42%) 2022-11-23T02:58:27.3134989Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020203.xml (deflated 41%) 2022-11-23T02:58:27.3135786Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020212.xml (deflated 42%) 2022-11-23T02:58:27.3136562Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020216.xml (deflated 41%) 2022-11-23T02:58:27.3137338Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020226.xml (deflated 42%) 2022-11-23T02:58:27.3138104Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020235.xml (deflated 41%) 2022-11-23T02:58:27.3138885Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020244.xml (deflated 42%) 2022-11-23T02:58:27.3139667Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020248.xml (deflated 42%) 2022-11-23T02:58:27.3140450Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020252.xml (deflated 42%) 2022-11-23T02:58:27.3141214Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020256.xml (deflated 41%) 2022-11-23T02:58:27.3141997Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020305.xml (deflated 41%) 2022-11-23T02:58:27.3142780Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020314.xml (deflated 40%) 2022-11-23T02:58:27.3143556Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020321.xml (deflated 41%) 2022-11-23T02:58:27.3144325Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020328.xml (deflated 42%) 2022-11-23T02:58:27.3145107Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020332.xml (deflated 42%) 2022-11-23T02:58:27.3145884Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020336.xml (deflated 40%) 2022-11-23T02:58:27.3146664Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020343.xml (deflated 41%) 2022-11-23T02:58:27.3147428Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020346.xml (deflated 41%) 2022-11-23T02:58:27.3148206Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020348.xml (deflated 42%) 2022-11-23T02:58:27.3149655Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020350.xml (deflated 41%) 2022-11-23T02:58:27.3150510Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020353.xml (deflated 41%) 2022-11-23T02:58:27.3151272Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020355.xml (deflated 40%) 2022-11-23T02:58:27.3152055Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020358.xml (deflated 40%) 2022-11-23T02:58:27.3152835Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020400.xml (deflated 42%) 2022-11-23T02:58:27.3153724Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020407.xml (deflated 42%) 2022-11-23T02:58:27.3154756Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020409.xml (deflated 41%) 2022-11-23T02:58:27.3155764Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020416.xml (deflated 40%) 2022-11-23T02:58:27.3156645Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020423.xml (deflated 40%) 2022-11-23T02:58:27.3157544Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020430.xml (deflated 40%) 2022-11-23T02:58:27.3158422Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020439.xml (deflated 41%) 2022-11-23T02:58:27.3159381Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020448.xml (deflated 40%) 2022-11-23T02:58:27.3160276Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020457.xml (deflated 40%) 2022-11-23T02:58:27.3161180Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020506.xml (deflated 41%) 2022-11-23T02:58:27.3162130Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020516.xml (deflated 41%) 2022-11-23T02:58:27.3163091Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020541.xml (deflated 41%) 2022-11-23T02:58:27.3164010Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020606.xml (deflated 42%) 2022-11-23T02:58:27.3164910Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020609.xml (deflated 41%) 2022-11-23T02:58:27.3165762Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020611.xml (deflated 41%) 2022-11-23T02:58:27.3166782Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020614.xml (deflated 41%) 2022-11-23T02:58:27.3167763Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020616.xml (deflated 41%) 2022-11-23T02:58:27.3168575Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020618.xml (deflated 42%) 2022-11-23T02:58:27.3169340Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020621.xml (deflated 42%) 2022-11-23T02:58:27.3170120Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020623.xml (deflated 42%) 2022-11-23T02:58:27.3170909Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020626.xml (deflated 42%) 2022-11-23T02:58:27.3171817Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020628.xml (deflated 42%) 2022-11-23T02:58:27.3172583Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020630.xml (deflated 42%) 2022-11-23T02:58:27.3173366Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020633.xml (deflated 42%) 2022-11-23T02:58:27.3174143Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020635.xml (deflated 41%) 2022-11-23T02:58:27.3174925Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020638.xml (deflated 40%) 2022-11-23T02:58:27.3175686Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020644.xml (deflated 40%) 2022-11-23T02:58:27.3176522Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020651.xml (deflated 41%) 2022-11-23T02:58:27.3177314Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020654.xml (deflated 42%) 2022-11-23T02:58:27.3178097Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020656.xml (deflated 42%) 2022-11-23T02:58:27.3178877Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020700.xml (deflated 40%) 2022-11-23T02:58:27.3179647Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020710.xml (deflated 41%) 2022-11-23T02:58:27.3180427Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020719.xml (deflated 40%) 2022-11-23T02:58:27.3181208Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020729.xml (deflated 42%) 2022-11-23T02:58:27.3181999Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020733.xml (deflated 41%) 2022-11-23T02:58:27.3182773Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020737.xml (deflated 40%) 2022-11-23T02:58:27.3183546Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020744.xml (deflated 40%) 2022-11-23T02:58:27.3184325Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020751.xml (deflated 42%) 2022-11-23T02:58:27.3185100Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020755.xml (deflated 40%) 2022-11-23T02:58:27.3185865Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020804.xml (deflated 40%) 2022-11-23T02:58:27.3186762Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020814.xml (deflated 40%) 2022-11-23T02:58:27.3188275Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020823.xml (deflated 40%) 2022-11-23T02:58:27.3190192Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020832.xml (deflated 42%) 2022-11-23T02:58:27.3191002Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020839.xml (deflated 42%) 2022-11-23T02:58:27.3191793Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020846.xml (deflated 42%) 2022-11-23T02:58:27.3192579Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020853.xml (deflated 42%) 2022-11-23T02:58:27.3193531Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020900.xml (deflated 41%) 2022-11-23T02:58:27.3194305Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020909.xml (deflated 41%) 2022-11-23T02:58:27.3195088Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020918.xml (deflated 42%) 2022-11-23T02:58:27.3195873Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020921.xml (deflated 41%) 2022-11-23T02:58:27.3196657Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020930.xml (deflated 43%) 2022-11-23T02:58:27.3197429Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020932.xml (deflated 43%) 2022-11-23T02:58:27.3198289Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020935.xml (deflated 41%) 2022-11-23T02:58:27.3199087Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020944.xml (deflated 42%) 2022-11-23T02:58:27.3199877Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020946.xml (deflated 42%) 2022-11-23T02:58:27.3200645Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020949.xml (deflated 40%) 2022-11-23T02:58:27.3201427Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123020958.xml (deflated 41%) 2022-11-23T02:58:27.3202206Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021000.xml (deflated 41%) 2022-11-23T02:58:27.3202995Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021003.xml (deflated 41%) 2022-11-23T02:58:27.3203767Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021005.xml (deflated 42%) 2022-11-23T02:58:27.3204551Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021007.xml (deflated 41%) 2022-11-23T02:58:27.3205328Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021010.xml (deflated 41%) 2022-11-23T02:58:27.3206107Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021012.xml (deflated 41%) 2022-11-23T02:58:27.3206878Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021015.xml (deflated 41%) 2022-11-23T02:58:27.3207667Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021017.xml (deflated 41%) 2022-11-23T02:58:27.3208451Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021026.xml (deflated 42%) 2022-11-23T02:58:27.3209228Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021028.xml (deflated 42%) 2022-11-23T02:58:27.3209990Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021031.xml (deflated 42%) 2022-11-23T02:58:27.3210775Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021033.xml (deflated 40%) 2022-11-23T02:58:27.3211555Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021042.xml (deflated 41%) 2022-11-23T02:58:27.3212329Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021045.xml (deflated 41%) 2022-11-23T02:58:27.3213170Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021047.xml (deflated 41%) 2022-11-23T02:58:27.3213955Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021049.xml (deflated 40%) 2022-11-23T02:58:27.3214735Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021059.xml (deflated 41%) 2022-11-23T02:58:27.3215511Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021107.xml (deflated 40%) 2022-11-23T02:58:27.3216274Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021117.xml (deflated 41%) 2022-11-23T02:58:27.3217055Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021126.xml (deflated 42%) 2022-11-23T02:58:27.3217891Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021128.xml (deflated 42%) 2022-11-23T02:58:27.3218683Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021131.xml (deflated 41%) 2022-11-23T02:58:27.3219448Z adding: 
test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021140.xml (deflated 41%) 2022-11-23T02:58:27.3220234Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021149.xml (deflated 42%) 2022-11-23T02:58:27.3221011Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021151.xml (deflated 41%) 2022-11-23T02:58:27.3221795Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021200.xml (deflated 41%) 2022-11-23T02:58:27.3222567Z adding: test/test-reports/dist-ucc/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20221123021214.xml (deflated 40%) 2022-11-23T02:58:27.3223369Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_dedup_tensors/TEST-TestDedupTensor-20221123021256.xml (deflated 40%) 2022-11-23T02:58:27.3224158Z adding: test/test-reports/python-unittest/distributed._composable.test_checkpoint/TEST-TestCheckpoint-20221123021304.xml (deflated 55%) 2022-11-23T02:58:27.3224941Z adding: test/test-reports/python-unittest/distributed.checkpoint.test_utils/TEST-TestMedatadaIndex-20221123021307.xml (deflated 71%) 2022-11-23T02:58:27.3225745Z adding: test/test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestGetSubmoduleToStates-20221123021311.xml (deflated 43%) 2022-11-23T02:58:27.3226485Z adding: test/test-reports/python-unittest/distributed.fsdp.test_utils/TEST-TestUtils-20221123021311.xml (deflated 69%) 2022-11-23T02:58:27.3227284Z adding: test/test-reports/python-unittest/distributed._shard.sharded_optim.test_sharded_optim/TEST-TestShardedOptimizer-20221123021315.xml (deflated 52%) 2022-11-23T02:58:27.3228084Z adding: test/test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallel-20221123021323.xml (deflated 83%) 2022-11-23T02:58:27.3228903Z adding: 
test/test-reports/python-unittest/distributed.test_data_parallel/TEST-TestDataParallelDeviceTypeCUDA-20221123021323.xml (deflated 90%) 2022-11-23T02:58:27.3230457Z adding: test/test-reports/python-unittest/distributed.elastic.utils.distributed_test/TEST-DistributedUtilTest-20221123021330.xml (deflated 71%) 2022-11-23T02:58:27.3231286Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_uneven/TEST-TestUnevenParamShard-20221123021337.xml (deflated 41%) 2022-11-23T02:58:27.3232064Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_pure_fp16/TEST-TestPureFP16-20221123021346.xml (deflated 55%) 2022-11-23T02:58:27.3232864Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_softmax/TEST-TestShardedSoftmax-20221123021354.xml (deflated 59%) 2022-11-23T02:58:27.3233824Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_chunk/TEST-TestShardedTensorChunkOps-20221123021403.xml (deflated 60%) 2022-11-23T02:58:27.3234645Z adding: test/test-reports/python-unittest/distributed.test_c10d_error_logger/TEST-C10dErrorLoggerTest-20221123021411.xml (deflated 53%) 2022-11-23T02:58:27.3235473Z adding: test/test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestCustomShardingSpec-20221123021422.xml (deflated 66%) 2022-11-23T02:58:27.3236314Z adding: test/test-reports/python-unittest/distributed._shard.sharding_spec.test_sharding_spec/TEST-TestShardingSpec-20221123021422.xml (deflated 79%) 2022-11-23T02:58:27.3237062Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_input/TEST-TestInput-20221123021433.xml (deflated 57%) 2022-11-23T02:58:27.3237994Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_elementwise_ops/TEST-TestShardedTensorElementWiseOps-20221123021445.xml (deflated 74%) 2022-11-23T02:58:27.3238886Z adding: 
test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorOps-20221123021459.xml (deflated 67%) 2022-11-23T02:58:27.3239708Z adding: test/test-reports/python-unittest/distributed._shard.test_partial_tensor/TEST-TestPartialTensorReshard-20221123021459.xml (deflated 61%) 2022-11-23T02:58:27.3240480Z adding: test/test-reports/python-unittest/distributed._tensor.test_math_ops/TEST-DistMathOpsTest-20221123021515.xml (deflated 61%) 2022-11-23T02:58:27.3241323Z adding: test/test-reports/python-unittest/distributed._tensor.parallel.test_tp_examples/TEST-DistTensorParallelExampleTest-20221123021531.xml (deflated 64%) 2022-11-23T02:58:27.3242147Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_memory/TEST-TestFSDPMemory-20221123021548.xml (deflated 54%) 2022-11-23T02:58:27.3242947Z adding: test/test-reports/python-unittest/distributed._tensor.test_pointwise_ops/TEST-DistElementwiseOpsTest-20221123021607.xml (deflated 68%) 2022-11-23T02:58:27.3243760Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_tp_integration/TEST-TestTPFSDPIntegration-20221123021627.xml (deflated 80%) 2022-11-23T02:58:27.3244579Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_clip_grad_norm/TEST-TestClipGradNorm-20221123021651.xml (deflated 55%) 2022-11-23T02:58:27.3245437Z adding: test/test-reports/python-unittest/distributed._shard.sharded_tensor.ops.test_matrix_ops/TEST-TestShardedTensorMatrixOps-20221123021721.xml (deflated 86%) 2022-11-23T02:58:27.3246304Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021756.xml (deflated 42%) 2022-11-23T02:58:27.3247152Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021800.xml (deflated 42%) 2022-11-23T02:58:27.3247990Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021810.xml (deflated 42%) 2022-11-23T02:58:27.3248827Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021819.xml (deflated 42%) 2022-11-23T02:58:27.3249666Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021829.xml (deflated 42%) 2022-11-23T02:58:27.3250563Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_ucc/TEST-TestDistributedNNFunctionsUcc-20221123021838.xml (deflated 42%) 2022-11-23T02:58:27.3251342Z adding: test/test-reports/python-unittest/distributed._tensor.test_matrix_ops/TEST-DistMatrixOpsTest-20221123021847.xml (deflated 75%) 2022-11-23T02:58:27.3252130Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_flatten_params/TEST-TestFlattenParams-20221123021924.xml (deflated 77%) 2022-11-23T02:58:27.3253090Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022008.xml (deflated 44%) 2022-11-23T02:58:27.3254038Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022012.xml (deflated 44%) 2022-11-23T02:58:27.3254959Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-DistributedDataParallelSingleProcessTest-20221123022017.xml (deflated 44%) 2022-11-23T02:58:27.3255863Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022022.xml (deflated 42%) 2022-11-23T02:58:27.3256726Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022031.xml (deflated 42%) 2022-11-23T02:58:27.3257636Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022040.xml (deflated 43%) 2022-11-23T02:58:27.3258491Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022050.xml (deflated 42%) 2022-11-23T02:58:27.3259438Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022059.xml (deflated 42%) 2022-11-23T02:58:27.3260287Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022108.xml (deflated 42%) 2022-11-23T02:58:27.3261125Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022117.xml (deflated 42%) 2022-11-23T02:58:27.3261976Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_gloo/TEST-TestDistributedNNFunctionsGloo-20221123022126.xml (deflated 42%) 2022-11-23T02:58:27.3262837Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022140.xml (deflated 42%) 2022-11-23T02:58:27.3263677Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022149.xml (deflated 42%) 2022-11-23T02:58:27.3264511Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022158.xml (deflated 43%) 2022-11-23T02:58:27.3265361Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022208.xml (deflated 43%) 2022-11-23T02:58:27.3266212Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022217.xml (deflated 42%) 2022-11-23T02:58:27.3267054Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022226.xml (deflated 42%) 2022-11-23T02:58:27.3267898Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022236.xml (deflated 42%) 2022-11-23T02:58:27.3268746Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022245.xml (deflated 42%) 2022-11-23T02:58:27.3270357Z adding: test/test-reports/python-unittest/distributed.test_c10d_spawn_nccl/TEST-TestDistributedNNFunctionsNccl-20221123022254.xml (deflated 42%) 2022-11-23T02:58:27.3271203Z adding: test/test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshCollectiveTest-20221123022302.xml (deflated 88%) 2022-11-23T02:58:27.3271966Z adding: test/test-reports/python-unittest/distributed._tensor.test_device_mesh/TEST-DeviceMeshTest-20221123022302.xml (deflated 73%) 2022-11-23T02:58:27.3272764Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022358.xml (deflated 40%) 2022-11-23T02:58:27.3273727Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022404.xml (deflated 41%) 2022-11-23T02:58:27.3274552Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022411.xml (deflated 41%) 2022-11-23T02:58:27.3275355Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022417.xml (deflated 41%) 2022-11-23T02:58:27.3276172Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022423.xml (deflated 40%) 2022-11-23T02:58:27.3277000Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022430.xml (deflated 41%) 
2022-11-23T02:58:27.3277819Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022436.xml (deflated 41%) 2022-11-23T02:58:27.3278693Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022443.xml (deflated 41%) 2022-11-23T02:58:27.3279524Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupGlooWrapperTest-20221123022449.xml (deflated 40%) 2022-11-23T02:58:27.3280356Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022456.xml (deflated 39%) 2022-11-23T02:58:27.3281176Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022502.xml (deflated 39%) 2022-11-23T02:58:27.3281990Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022510.xml (deflated 39%) 2022-11-23T02:58:27.3282788Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022518.xml (deflated 40%) 2022-11-23T02:58:27.3283622Z adding: test/test-reports/python-unittest/distributed.test_pg_wrapper/TEST-ProcessGroupNCCLWrapperTest-20221123022527.xml (deflated 39%) 2022-11-23T02:58:27.3284445Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_comm_hooks/TEST-TestCommunicationHooks-20221123022536.xml (deflated 91%) 2022-11-23T02:58:27.3285367Z adding: test/test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerDistributed-20221123022737.xml (deflated 90%) 2022-11-23T02:58:27.3286351Z adding: test/test-reports/python-unittest/distributed.optim.test_zero_redundancy_optimizer/TEST-TestZeroRedundancyOptimizerSingleRank-20221123022737.xml (deflated 73%) 2022-11-23T02:58:27.3287243Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_optim_state/TEST-TestFSDPOptimState-20221123023037.xml (deflated 93%) 2022-11-23T02:58:27.3287974Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023450.xml (deflated 38%) 2022-11-23T02:58:27.3288660Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023457.xml (deflated 38%) 2022-11-23T02:58:27.3289316Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023505.xml (deflated 38%) 2022-11-23T02:58:27.3289986Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023511.xml (deflated 38%) 2022-11-23T02:58:27.3290652Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023517.xml (deflated 38%) 2022-11-23T02:58:27.3291318Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023525.xml (deflated 38%) 2022-11-23T02:58:27.3291974Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023533.xml (deflated 39%) 2022-11-23T02:58:27.3292639Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023540.xml (deflated 38%) 2022-11-23T02:58:27.3293376Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023546.xml (deflated 37%) 2022-11-23T02:58:27.3294043Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023552.xml (deflated 37%) 2022-11-23T02:58:27.3294692Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CommTest-20221123023559.xml (deflated 37%) 2022-11-23T02:58:27.3295381Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023605.xml (deflated 38%) 2022-11-23T02:58:27.3296082Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023611.xml (deflated 37%) 2022-11-23T02:58:27.3296776Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023619.xml (deflated 38%) 2022-11-23T02:58:27.3297450Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023626.xml (deflated 38%) 2022-11-23T02:58:27.3298217Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023634.xml (deflated 38%) 2022-11-23T02:58:27.3298917Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023640.xml (deflated 38%) 2022-11-23T02:58:27.3299587Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023648.xml (deflated 38%) 2022-11-23T02:58:27.3300269Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023654.xml (deflated 38%) 2022-11-23T02:58:27.3300970Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023702.xml (deflated 38%) 2022-11-23T02:58:27.3301652Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023709.xml (deflated 38%) 2022-11-23T02:58:27.3302329Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-CompilerTest-20221123023715.xml (deflated 38%) 2022-11-23T02:58:27.3303083Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023723.xml (deflated 44%) 2022-11-23T02:58:27.3303902Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023731.xml (deflated 45%) 2022-11-23T02:58:27.3304711Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023740.xml (deflated 43%) 2022-11-23T02:58:27.3305502Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023748.xml (deflated 42%) 2022-11-23T02:58:27.3306313Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023757.xml (deflated 45%) 2022-11-23T02:58:27.3307118Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023805.xml (deflated 45%) 2022-11-23T02:58:27.3307926Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023813.xml (deflated 46%) 2022-11-23T02:58:27.3308726Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023822.xml (deflated 46%) 2022-11-23T02:58:27.3310034Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023830.xml (deflated 44%) 2022-11-23T02:58:27.3310839Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023839.xml (deflated 45%) 2022-11-23T02:58:27.3311644Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023847.xml (deflated 46%) 2022-11-23T02:58:27.3312443Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023856.xml (deflated 44%) 2022-11-23T02:58:27.3313352Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023904.xml (deflated 44%) 2022-11-23T02:58:27.3314150Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023913.xml (deflated 43%) 2022-11-23T02:58:27.3314953Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023919.xml (deflated 44%) 2022-11-23T02:58:27.3315758Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023927.xml (deflated 45%) 2022-11-23T02:58:27.3316545Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023933.xml (deflated 44%) 2022-11-23T02:58:27.3317336Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023939.xml (deflated 45%) 2022-11-23T02:58:27.3318207Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023946.xml (deflated 45%) 2022-11-23T02:58:27.3319021Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123023952.xml (deflated 50%) 2022-11-23T02:58:27.3319805Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024000.xml (deflated 42%) 2022-11-23T02:58:27.3320599Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024008.xml (deflated 41%) 2022-11-23T02:58:27.3321394Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024016.xml (deflated 41%) 2022-11-23T02:58:27.3322190Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024024.xml (deflated 41%) 2022-11-23T02:58:27.3322983Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024032.xml (deflated 41%) 2022-11-23T02:58:27.3323776Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024040.xml (deflated 42%) 2022-11-23T02:58:27.3324578Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024047.xml (deflated 42%) 2022-11-23T02:58:27.3325373Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024053.xml (deflated 41%) 2022-11-23T02:58:27.3326158Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024059.xml (deflated 41%) 2022-11-23T02:58:27.3326953Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024106.xml (deflated 44%) 2022-11-23T02:58:27.3327758Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024112.xml (deflated 45%) 2022-11-23T02:58:27.3328561Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024118.xml (deflated 41%) 2022-11-23T02:58:27.3329343Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024126.xml (deflated 41%) 2022-11-23T02:58:27.3330140Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024133.xml (deflated 41%) 2022-11-23T02:58:27.3330931Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024141.xml (deflated 41%) 2022-11-23T02:58:27.3331727Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024147.xml (deflated 42%) 2022-11-23T02:58:27.3332581Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024154.xml (deflated 41%) 2022-11-23T02:58:27.3333383Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-DistributedDataParallelTest-20221123024203.xml (deflated 41%) 2022-11-23T02:58:27.3334287Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024211.xml (deflated 42%) 2022-11-23T02:58:27.3335268Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024218.xml (deflated 42%) 2022-11-23T02:58:27.3336222Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024224.xml (deflated 44%) 2022-11-23T02:58:27.3337187Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-GlooProcessGroupWithDispatchedCollectivesTests-20221123024230.xml (deflated 42%) 2022-11-23T02:58:27.3338110Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024236.xml (deflated 39%) 2022-11-23T02:58:27.3338883Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024243.xml (deflated 39%) 2022-11-23T02:58:27.3339609Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024826.xml (deflated 39%) 2022-11-23T02:58:27.3340320Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024251.xml (deflated 39%) 2022-11-23T02:58:27.3341076Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024258.xml (deflated 40%) 2022-11-23T02:58:27.3341837Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024304.xml (deflated 40%) 2022-11-23T02:58:27.3342597Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024310.xml (deflated 39%) 2022-11-23T02:58:27.3343334Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024317.xml (deflated 40%) 2022-11-23T02:58:27.3344092Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024324.xml (deflated 40%) 2022-11-23T02:58:27.3344847Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024334.xml (deflated 40%) 2022-11-23T02:58:27.3345600Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024340.xml (deflated 40%) 2022-11-23T02:58:27.3346337Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024349.xml (deflated 39%) 2022-11-23T02:58:27.3347094Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024357.xml (deflated 39%) 2022-11-23T02:58:27.3347850Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024404.xml (deflated 40%) 2022-11-23T02:58:27.3348602Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024410.xml (deflated 40%) 2022-11-23T02:58:27.3349767Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024417.xml (deflated 40%) 2022-11-23T02:58:27.3350563Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024423.xml (deflated 39%) 2022-11-23T02:58:27.3351321Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024429.xml (deflated 39%) 2022-11-23T02:58:27.3352074Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024438.xml (deflated 40%) 2022-11-23T02:58:27.3352932Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024445.xml (deflated 40%) 2022-11-23T02:58:27.3353686Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024451.xml (deflated 40%) 2022-11-23T02:58:27.3354442Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024500.xml (deflated 40%) 2022-11-23T02:58:27.3355192Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024507.xml (deflated 40%) 2022-11-23T02:58:27.3355926Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024513.xml (deflated 40%) 2022-11-23T02:58:27.3356681Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024521.xml (deflated 40%) 2022-11-23T02:58:27.3357436Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024528.xml (deflated 40%) 2022-11-23T02:58:27.3358254Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024535.xml (deflated 40%) 2022-11-23T02:58:27.3359006Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024543.xml (deflated 40%) 2022-11-23T02:58:27.3359756Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024550.xml (deflated 39%) 2022-11-23T02:58:27.3360510Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024556.xml (deflated 39%) 2022-11-23T02:58:27.3361256Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024604.xml (deflated 39%) 2022-11-23T02:58:27.3361987Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024611.xml (deflated 39%) 2022-11-23T02:58:27.3362743Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024618.xml (deflated 40%) 2022-11-23T02:58:27.3363494Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024625.xml (deflated 39%) 2022-11-23T02:58:27.3364238Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024635.xml (deflated 39%) 2022-11-23T02:58:27.3364969Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024641.xml (deflated 39%) 2022-11-23T02:58:27.3365713Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024648.xml (deflated 39%) 2022-11-23T02:58:27.3366465Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024656.xml (deflated 39%) 2022-11-23T02:58:27.3367219Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024702.xml (deflated 40%) 2022-11-23T02:58:27.3368010Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024709.xml (deflated 39%) 2022-11-23T02:58:27.3368756Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024718.xml (deflated 39%) 2022-11-23T02:58:27.3369510Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024725.xml (deflated 40%) 2022-11-23T02:58:27.3393941Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024732.xml (deflated 40%) 2022-11-23T02:58:27.3394835Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024738.xml (deflated 39%) 2022-11-23T02:58:27.3395605Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024746.xml (deflated 39%) 2022-11-23T02:58:27.3396538Z adding: 
test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024753.xml (deflated 39%) 2022-11-23T02:58:27.3397299Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024800.xml (deflated 41%) 2022-11-23T02:58:27.3398050Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024802.xml (deflated 40%) 2022-11-23T02:58:27.3398791Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024809.xml (deflated 41%) 2022-11-23T02:58:27.3399536Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024811.xml (deflated 40%) 2022-11-23T02:58:27.3400281Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ProcessGroupGlooTest-20221123024819.xml (deflated 39%) 2022-11-23T02:58:27.3401080Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024828.xml (deflated 39%) 2022-11-23T02:58:27.3401788Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024830.xml (deflated 39%) 2022-11-23T02:58:27.3402473Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024832.xml (deflated 39%) 2022-11-23T02:58:27.3403167Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024835.xml (deflated 38%) 2022-11-23T02:58:27.3403863Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-ReducerTest-20221123024837.xml (deflated 39%) 2022-11-23T02:58:27.3404588Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-RendezvousEnvTest-20221123024839.xml (deflated 39%) 2022-11-23T02:58:27.3405282Z adding: test/test-reports/python-unittest/distributed.test_c10d_gloo/TEST-TimeoutTest-20221123024843.xml (deflated 41%) 2022-11-23T02:58:27.3405995Z adding: 
test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestHooks-20221123024847.xml (deflated 79%) 2022-11-23T02:58:27.3406716Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestNoGrad-20221123024847.xml (deflated 63%) 2022-11-23T02:58:27.3407434Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParamInit-20221123024847.xml (deflated 60%) 2022-11-23T02:58:27.3408198Z adding: test/test-reports/python-unittest/distributed.fsdp.test_fsdp_core/TEST-TestParityWithDDP-20221123024847.xml (deflated 90%) 2022-11-23T02:58:27.3429593Z ##[group]Run # Remove any previous test reports if they exist 2022-11-23T02:58:27.3430002Z # Remove any previous test reports if they exist 2022-11-23T02:58:27.3430306Z rm -f usage-log-*.zip 2022-11-23T02:58:27.3430689Z # this workflow is also run in bazel build test, but we dont generate usage reports for it 2022-11-23T02:58:27.3431101Z # so check to see if the file exists first 2022-11-23T02:58:27.3431402Z if [ -f 'usage_log.txt' ]; then 2022-11-23T02:58:27.3431735Z  zip "usage-log-${FILE_SUFFIX}.zip" 'usage_log.txt' 2022-11-23T02:58:27.3432031Z fi 2022-11-23T02:58:27.3443922Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:58:27.3444223Z env: 2022-11-23T02:58:27.3444463Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:27.3444706Z GPU_FLAG: --gpus all 2022-11-23T02:58:27.3445065Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:27.3445531Z FILE_SUFFIX: test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299 2022-11-23T02:58:27.3445870Z ##[endgroup] 2022-11-23T02:58:27.4304410Z adding: usage_log.txt (deflated 95%) 2022-11-23T02:58:27.4348741Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T02:58:27.4349375Z with: 2022-11-23T02:58:27.4349648Z s3-prefix: pytorch/pytorch/3528293562/1/artifact 2022-11-23T02:58:27.4350115Z retention-days: 14 2022-11-23T02:58:27.4350393Z 
if-no-files-found: warn 2022-11-23T02:58:27.4350653Z path: test-jsons-*.zip 2022-11-23T02:58:27.4350908Z name: artifact 2022-11-23T02:58:27.4351161Z s3-bucket: gha-artifacts 2022-11-23T02:58:27.4351407Z region: us-east-1 2022-11-23T02:58:27.4351633Z env: 2022-11-23T02:58:27.4351863Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:27.4352109Z GPU_FLAG: --gpus all 2022-11-23T02:58:27.4352471Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:27.4352814Z ##[endgroup] 2022-11-23T02:58:27.8817072Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T02:58:27.8817938Z With the provided path, there will be 1 file uploaded 2022-11-23T02:58:27.8818302Z Uploading to s3 prefix: pytorch/pytorch/3528293562/1/artifact 2022-11-23T02:58:27.8829458Z Starting upload of test-jsons-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299.zip 2022-11-23T02:58:28.0180865Z Finished upload of test-jsons-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299.zip 2022-11-23T02:58:28.0375859Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T02:58:28.0376155Z with: 2022-11-23T02:58:28.0376421Z s3-prefix: pytorch/pytorch/3528293562/1/artifact 2022-11-23T02:58:28.0376720Z retention-days: 14 2022-11-23T02:58:28.0376999Z if-no-files-found: error 2022-11-23T02:58:28.0377261Z path: test-reports-*.zip 2022-11-23T02:58:28.0377516Z name: artifact 2022-11-23T02:58:28.0377765Z s3-bucket: gha-artifacts 2022-11-23T02:58:28.0378007Z region: us-east-1 2022-11-23T02:58:28.0378236Z env: 2022-11-23T02:58:28.0378473Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:28.0378720Z GPU_FLAG: --gpus all 2022-11-23T02:58:28.0379078Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:28.0379421Z ##[endgroup] 2022-11-23T02:58:28.4910154Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T02:58:28.4911057Z With the provided path, there will be 1 file uploaded 2022-11-23T02:58:28.4911455Z Uploading 
to s3 prefix: pytorch/pytorch/3528293562/1/artifact 2022-11-23T02:58:28.4922746Z Starting upload of test-reports-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299.zip 2022-11-23T02:58:28.6594421Z Finished upload of test-reports-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299.zip 2022-11-23T02:58:28.6765841Z ##[group]Run seemethere/upload-artifact-s3@v5 2022-11-23T02:58:28.6766137Z with: 2022-11-23T02:58:28.6766404Z s3-prefix: pytorch/pytorch/3528293562/1/artifact 2022-11-23T02:58:28.6766704Z retention-days: 14 2022-11-23T02:58:28.6766975Z if-no-files-found: ignore 2022-11-23T02:58:28.6767237Z path: usage-log-*.zip 2022-11-23T02:58:28.6767489Z name: artifact 2022-11-23T02:58:28.6767738Z s3-bucket: gha-artifacts 2022-11-23T02:58:28.6767983Z region: us-east-1 2022-11-23T02:58:28.6768212Z env: 2022-11-23T02:58:28.6768452Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:28.6768709Z GPU_FLAG: --gpus all 2022-11-23T02:58:28.6769071Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:28.6769412Z ##[endgroup] 2022-11-23T02:58:29.1233410Z NOTE: s3-prefix specified, ignoring name parameter 2022-11-23T02:58:29.1234133Z With the provided path, there will be 1 file uploaded 2022-11-23T02:58:29.1234500Z Uploading to s3 prefix: pytorch/pytorch/3528293562/1/artifact 2022-11-23T02:58:29.1245963Z Starting upload of usage-log-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299.zip 2022-11-23T02:58:29.2886433Z Finished upload of usage-log-test-distributed-3-3-linux.8xlarge.nvidia.gpu_9655200299.zip 2022-11-23T02:58:29.3056587Z ##[group]Run # shellcheck disable=SC2156 2022-11-23T02:58:29.3057110Z # shellcheck disable=SC2156 2022-11-23T02:58:29.3057520Z find . 
-iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2022-11-23T02:58:29.3071340Z shell: /usr/bin/bash -e {0} 2022-11-23T02:58:29.3071698Z env: 2022-11-23T02:58:29.3071943Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:29.3072198Z GPU_FLAG: --gpus all 2022-11-23T02:58:29.3072558Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:29.3072905Z ##[endgroup] 2022-11-23T02:58:29.6110495Z ##[group]Run set -x 2022-11-23T02:58:29.6110805Z set -x 2022-11-23T02:58:29.6111096Z python3 -m pip install -r requirements.txt 2022-11-23T02:58:29.6111438Z python3 -m pip install boto3==1.19.12 2022-11-23T02:58:29.6111842Z python3 -m tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-11-23T02:58:29.6124123Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:58:29.6124428Z env: 2022-11-23T02:58:29.6124672Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:58:29.6124922Z GPU_FLAG: --gpus all 2022-11-23T02:58:29.6125279Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:58:29.6125654Z AWS_DEFAULT_REGION: us-east-1 2022-11-23T02:58:29.6125897Z BRANCH: master 2022-11-23T02:58:29.6126149Z TEST_CONFIG: distributed 2022-11-23T02:58:29.6126404Z SHARD_NUMBER: 3 2022-11-23T02:58:29.6126722Z BUILD_ENVIRONMENT: linux-bionic-cuda11.6-py3.10-gcc7 2022-11-23T02:58:29.6127053Z PR_NUMBER: 2022-11-23T02:58:29.6127315Z PYTORCH_RETRY_TEST_CASES: 1 2022-11-23T02:58:29.6127588Z PYTORCH_OVERRIDE_FLAKY_SIGNAL: 1 2022-11-23T02:58:29.6127896Z SHA1: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T02:58:29.6128165Z TAG: 2022-11-23T02:58:29.6128380Z WORKFLOW_ID: 3528293562 2022-11-23T02:58:29.6128820Z GITHUB_TOKEN: *** 2022-11-23T02:58:29.6129083Z GHA_WORKFLOW_JOB_ID: 9655200299 2022-11-23T02:58:29.6129327Z ##[endgroup] 2022-11-23T02:58:29.6158673Z + python3 -m pip install -r requirements.txt 2022-11-23T02:58:29.9123896Z 
Defaulting to user installation because normal site-packages is not writeable 2022-11-23T02:58:29.9967843Z Collecting astunparse 2022-11-23T02:58:30.0125903Z Downloading astunparse-1.6.3-py2.py3-none-any.whl (12 kB) 2022-11-23T02:58:30.0457504Z Collecting expecttest 2022-11-23T02:58:30.0500136Z Downloading expecttest-0.1.4-py3-none-any.whl (6.5 kB) 2022-11-23T02:58:30.0898837Z Collecting future 2022-11-23T02:58:30.0942231Z Downloading future-0.18.2.tar.gz (829 kB) 2022-11-23T02:58:31.9916347Z Collecting hypothesis 2022-11-23T02:58:31.9978136Z Downloading hypothesis-6.58.0-py3-none-any.whl (396 kB) 2022-11-23T02:58:32.8032733Z Collecting numpy 2022-11-23T02:58:32.8098583Z Downloading numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB) 2022-11-23T02:58:33.1481330Z Requirement already satisfied: psutil in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 7)) (5.9.1) 2022-11-23T02:58:33.2671042Z Collecting pyyaml 2022-11-23T02:58:33.2769122Z Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB) 2022-11-23T02:58:33.2976139Z Requirement already satisfied: requests in /home/ec2-user/.local/lib/python3.7/site-packages (from -r requirements.txt (line 9)) (2.26.0) 2022-11-23T02:58:33.3155565Z Requirement already satisfied: setuptools in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 10)) (49.1.3) 2022-11-23T02:58:33.3753278Z Collecting six 2022-11-23T02:58:33.4329886Z Downloading six-1.16.0-py2.py3-none-any.whl (11 kB) 2022-11-23T02:58:33.4672049Z Collecting types-dataclasses 2022-11-23T02:58:33.4738564Z Downloading types_dataclasses-0.6.6-py3-none-any.whl (2.9 kB) 2022-11-23T02:58:33.5165238Z Collecting typing_extensions 2022-11-23T02:58:33.5206838Z Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB) 2022-11-23T02:58:33.5765393Z Collecting sympy 2022-11-23T02:58:33.5840088Z Downloading 
sympy-1.10.1-py3-none-any.whl (6.4 MB) 2022-11-23T02:58:33.7823841Z Collecting filelock 2022-11-23T02:58:33.7870227Z Downloading filelock-3.8.0-py3-none-any.whl (10 kB) 2022-11-23T02:58:33.8805274Z Collecting networkx 2022-11-23T02:58:33.8932519Z Downloading networkx-2.6.3-py3-none-any.whl (1.9 MB) 2022-11-23T02:58:34.0209998Z Collecting jinja2 2022-11-23T02:58:34.0264651Z Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB) 2022-11-23T02:58:34.1254217Z Collecting wheel<1.0,>=0.23.0 2022-11-23T02:58:34.1298335Z Downloading wheel-0.38.4-py3-none-any.whl (36 kB) 2022-11-23T02:58:34.1799758Z Collecting attrs>=19.2.0 2022-11-23T02:58:34.1899728Z Downloading attrs-22.1.0-py2.py3-none-any.whl (58 kB) 2022-11-23T02:58:34.2668149Z Collecting exceptiongroup>=1.0.0; python_version < "3.11" 2022-11-23T02:58:34.2710982Z Downloading exceptiongroup-1.0.4-py3-none-any.whl (14 kB) 2022-11-23T02:58:34.3249830Z Collecting sortedcontainers<3.0.0,>=2.1.0 2022-11-23T02:58:34.3291658Z Downloading sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB) 2022-11-23T02:58:34.3391523Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (1.26.12) 2022-11-23T02:58:34.3617236Z Requirement already satisfied: certifi>=2017.4.17 in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2022.9.24) 2022-11-23T02:58:34.3629010Z Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (3.4) 2022-11-23T02:58:34.3645085Z Requirement already satisfied: charset-normalizer~=2.0.0; python_version >= "3" in /home/ec2-user/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 9)) (2.0.12) 2022-11-23T02:58:34.3930428Z Collecting mpmath>=0.19 2022-11-23T02:58:34.4226207Z Downloading mpmath-1.2.1-py3-none-any.whl (532 kB) 
2022-11-23T02:58:34.5737801Z Collecting MarkupSafe>=2.0 2022-11-23T02:58:34.5892187Z Downloading MarkupSafe-2.1.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB) 2022-11-23T02:58:34.6023281Z Using legacy 'setup.py install' for future, since package 'wheel' is not installed. 2022-11-23T02:58:34.7961921Z Installing collected packages: wheel, six, astunparse, expecttest, future, attrs, exceptiongroup, sortedcontainers, hypothesis, numpy, pyyaml, types-dataclasses, typing-extensions, mpmath, sympy, filelock, networkx, MarkupSafe, jinja2 2022-11-23T02:58:34.8248953Z WARNING: The script wheel is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T02:58:34.8249599Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T02:58:34.8684373Z Running setup.py install for future: started 2022-11-23T02:58:35.5320115Z Running setup.py install for future: finished with status 'done' 2022-11-23T02:58:35.8409886Z WARNING: The script hypothesis is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T02:58:35.8410576Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T02:58:37.8195341Z WARNING: The scripts f2py, f2py3 and f2py3.7 are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T02:58:37.8196455Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2022-11-23T02:58:46.7606037Z WARNING: The script isympy is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2022-11-23T02:58:46.7606746Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 
2022-11-23T02:58:48.0019986Z Successfully installed MarkupSafe-2.1.1 astunparse-1.6.3 attrs-22.1.0 exceptiongroup-1.0.4 expecttest-0.1.4 filelock-3.8.0 future-0.18.2 hypothesis-6.58.0 jinja2-3.1.2 mpmath-1.2.1 networkx-2.6.3 numpy-1.21.6 pyyaml-6.0 six-1.16.0 sortedcontainers-2.4.0 sympy-1.10.1 types-dataclasses-0.6.6 typing-extensions-4.4.0 wheel-0.38.4 2022-11-23T02:58:48.0819636Z + python3 -m pip install boto3==1.19.12 2022-11-23T02:58:48.3703338Z Defaulting to user installation because normal site-packages is not writeable 2022-11-23T02:58:49.3179537Z Collecting boto3==1.19.12 2022-11-23T02:58:49.3349123Z Downloading boto3-1.19.12-py3-none-any.whl (131 kB) 2022-11-23T02:58:49.3965145Z Collecting s3transfer<0.6.0,>=0.5.0 2022-11-23T02:58:49.4006310Z Downloading s3transfer-0.5.2-py3-none-any.whl (79 kB) 2022-11-23T02:58:49.4483726Z Collecting jmespath<1.0.0,>=0.7.1 2022-11-23T02:58:49.4523820Z Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB) 2022-11-23T02:58:50.5616512Z Collecting botocore<1.23.0,>=1.22.12 2022-11-23T02:58:50.5675431Z Downloading botocore-1.22.12-py3-none-any.whl (8.1 MB) 2022-11-23T02:58:50.8054647Z Collecting python-dateutil<3.0.0,>=2.1 2022-11-23T02:58:50.8096944Z Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2022-11-23T02:58:50.8279397Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /home/ec2-user/.local/lib/python3.7/site-packages (from botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.26.12) 2022-11-23T02:58:50.8497643Z Requirement already satisfied: six>=1.5 in /home/ec2-user/.local/lib/python3.7/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.23.0,>=1.22.12->boto3==1.19.12) (1.16.0) 2022-11-23T02:58:51.0596998Z Installing collected packages: python-dateutil, jmespath, botocore, s3transfer, boto3 2022-11-23T02:58:51.9407854Z Successfully installed boto3-1.19.12 botocore-1.22.12 jmespath-0.10.0 python-dateutil-2.8.2 s3transfer-0.5.2 2022-11-23T02:58:51.9953777Z + python3 -m 
tools.stats.print_test_stats --upload-to-s3 --compare-with-s3 test 2022-11-23T02:59:07.3919189Z [scribe] Scribe access token not provided, sending report via boto3... 2022-11-23T02:59:07.3919494Z 2022-11-23T02:59:07.3923518Z ----- Historic stats comparison result ------ 2022-11-23T02:59:07.3923946Z 2022-11-23T02:59:07.3924445Z job: linux-bionic-cuda11.6-py3.10-gcc7 2022-11-23T02:59:07.3925165Z commit: 1cfd3858ac54fe3883534309081631a0a892ba3f 2022-11-23T02:59:07.3925449Z 2022-11-23T02:59:07.3929403Z Commit graph (base is most recent master ancestor with at least one S3 report): 2022-11-23T02:59:07.3929697Z 2022-11-23T02:59:07.3930089Z : (master) 2022-11-23T02:59:07.3930435Z | 2022-11-23T02:59:07.3930920Z * 1cfd3858ac (HEAD) total time 3047.67s 2022-11-23T02:59:07.3932667Z * 26322544b8 (base) 14 reports, total time 4545.22s ± 1821.39s 2022-11-23T02:59:07.3933146Z * 7f4b4d2827 14 reports, total time 4184.38s ± 2086.66s 2022-11-23T02:59:07.3933583Z * b50699f247 14 reports, total time 4471.77s ± 1861.64s 2022-11-23T02:59:07.3934021Z * 8bf8e4d71e 14 reports, total time 4570.74s ± 1842.55s 2022-11-23T02:59:07.3934440Z * ce856cee7e 14 reports, total time 4577.38s ± 1838.96s 2022-11-23T02:59:07.3935112Z * 391b593ca2 14 reports, total time 4457.06s ± 1822.27s 2022-11-23T02:59:07.3935627Z * 5bba783d21 14 reports, total time 4489.09s ± 1854.90s 2022-11-23T02:59:07.3936061Z * ea920a1115 14 reports, total time 4455.13s ± 1838.12s 2022-11-23T02:59:07.3936549Z * 74e62a1fef 14 reports, total time 4552.71s ± 1811.91s 2022-11-23T02:59:07.3936978Z * 00b7d8ef23 14 reports, total time 4180.04s ± 2065.84s 2022-11-23T02:59:07.3937275Z | 2022-11-23T02:59:07.3937467Z : 2022-11-23T02:59:07.3937613Z 2022-11-23T02:59:07.3937788Z Removed (across 1013 suites) 0 tests, totaling 0.00s 2022-11-23T02:59:07.3938149Z Modified (across 0 suites) 0 tests, totaling 0.00s 2022-11-23T02:59:07.3938489Z Added (across 54 suites) 789 tests, totaling +3882.14s 2022-11-23T02:59:07.4530285Z ##[group]Run 
pytorch/test-infra/.github/actions/teardown-linux@main 2022-11-23T02:59:07.4530621Z with: 2022-11-23T02:59:07.4530836Z env: 2022-11-23T02:59:07.4531075Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:59:07.4531329Z GPU_FLAG: --gpus all 2022-11-23T02:59:07.4531685Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:59:07.4532027Z ##[endgroup] 2022-11-23T02:59:07.4549679Z ##[group]Run set -eou pipefail 2022-11-23T02:59:07.4549996Z set -eou pipefail 2022-11-23T02:59:07.4550246Z  2022-11-23T02:59:07.4550574Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2022-11-23T02:59:07.4550902Z for _ in $(seq 1440); do 2022-11-23T02:59:07.4551204Z  # Break if no ssh session exists anymore 2022-11-23T02:59:07.4551502Z  if [ "$(who)" = "" ]; then 2022-11-23T02:59:07.4551736Z  break 2022-11-23T02:59:07.4552005Z  fi 2022-11-23T02:59:07.4552241Z  echo "." 2022-11-23T02:59:07.4552464Z  sleep 5 2022-11-23T02:59:07.4552696Z done 2022-11-23T02:59:07.4565865Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2022-11-23T02:59:07.4566149Z env: 2022-11-23T02:59:07.4566392Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:59:07.4566663Z GPU_FLAG: --gpus all 2022-11-23T02:59:07.4567007Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:59:07.4567361Z ##[endgroup] 2022-11-23T02:59:07.4596547Z Holding runner for 2 hours until all ssh sessions have logged out 2022-11-23T02:59:07.4693469Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2022-11-23T02:59:07.4693911Z # ignore expansion of "docker ps -q" since it could be empty 2022-11-23T02:59:07.4694255Z # shellcheck disable=SC2046 2022-11-23T02:59:07.4694565Z docker stop $(docker ps -q) || true 2022-11-23T02:59:07.4694871Z # Prune all of the docker images 2022-11-23T02:59:07.4695169Z docker system prune -af 2022-11-23T02:59:07.4707267Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 
2022-11-23T02:59:07.4707552Z env: 2022-11-23T02:59:07.4707793Z GIT_DEFAULT_BRANCH: master 2022-11-23T02:59:07.4708059Z GPU_FLAG: --gpus all 2022-11-23T02:59:07.4708401Z DOCKER_CONTAINER_ID: 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:59:07.4708753Z ##[endgroup] 2022-11-23T02:59:08.5487331Z 08317a7e7676 2022-11-23T02:59:09.6440787Z Deleted Containers: 2022-11-23T02:59:09.6441211Z 08317a7e76765ddf9994159dcbf30c1ed063f32bd07985e062710dc3e458ac35 2022-11-23T02:59:09.6441448Z 2022-11-23T02:59:14.8659148Z Deleted Images: 2022-11-23T02:59:14.8660523Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7:072aae4a77ed7d3a69ad5683420509c41301b940 2022-11-23T02:59:14.8661738Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-bionic-cuda11.6-cudnn8-py3-gcc7@sha256:3a5626edfb2c43fb24303351be75287af92426b6bb7c6df2defc98f980346c6a 2022-11-23T02:59:14.8662375Z deleted: sha256:e2c63e8434298b5b8922fe396fb22d541e83da3321f8559334df676354c6a90a 2022-11-23T02:59:14.8662804Z deleted: sha256:e97e2654456ae35786d9ff4a73ece4d85ce36ae9bd4e402e5f8c4c41a4b8cb5d 2022-11-23T02:59:14.8663265Z deleted: sha256:0191afefc9967131b7cd6196bee5a1d3a4eba8c24d3e11ff67013ecd0d244f4d 2022-11-23T02:59:14.8663709Z deleted: sha256:cd6998962c740e934e511d315fd0139a2737289173123cd7675b630fe71d0a6f 2022-11-23T02:59:14.8664126Z deleted: sha256:684c9dbbc4faf4388438a99012caaa6e9e9c3ac93f3842ff7b2f4c81c6c66866 2022-11-23T02:59:14.8664579Z deleted: sha256:be75865fc66b386df8a53dd220b7f4fa8464d0c86f06b6fa84e7d5b8fa2b5333 2022-11-23T02:59:14.8665208Z deleted: sha256:9e5281171ccc5aa329fd085f38d4831c13f47e27ea26a9243daf336fc701114a 2022-11-23T02:59:14.8666050Z deleted: sha256:0ba6072392ef0b01b99d45293e62f415e397460b4bf5a00257afb7aa9cfccb14 2022-11-23T02:59:14.8666710Z deleted: sha256:5f0fab79723550908a4149737ce5268ceacba20bc9c1aea35acdb6ff93ba4aa7 2022-11-23T02:59:14.8667145Z deleted: 
sha256:0c8088138816657b983280a5e4385f5c159c90b6be095bc4972290be20d46c16 2022-11-23T02:59:14.8667568Z deleted: sha256:a9cdd96267ff8adf28efa06db7d37977216a7580ca475239528fff85024f9bcb 2022-11-23T02:59:14.8668004Z deleted: sha256:9abd11e0b20ee19055f20e11ac5a4cc029eee3433686ce8ab9ffb6636269391a 2022-11-23T02:59:14.8668595Z deleted: sha256:cb16cc59b9c802a04fe3283c4a00840d0a3c24128b3620964a7aa927a757d672 2022-11-23T02:59:14.8669364Z deleted: sha256:ed27e40372acea88785f25bcd63f03a56960f00e444e3d5b22e52915e885242b 2022-11-23T02:59:14.8669861Z deleted: sha256:395dfa2cf9efd2fde511c14dbaf706e2efb3ab003af0cd725614b86f10643247 2022-11-23T02:59:14.8670322Z deleted: sha256:ca415181cb076083a9af8e85b901ee24154183e2d4c3960e21aab48260376214 2022-11-23T02:59:14.8670766Z deleted: sha256:b13fc2861b47406c24208813cb5398b911d9bae952f11ed9a411f42e221f8dfc 2022-11-23T02:59:14.8671238Z deleted: sha256:9cbf0b121bab50c1cad2d31b40f6c7c52003ba77877a2ef6d9bc87a2c0b073d2 2022-11-23T02:59:14.8671768Z deleted: sha256:60e157b04ecdbe2bce04795e0fade9ec9aae999065bd410785dcbaedd9778a19 2022-11-23T02:59:14.8672214Z deleted: sha256:5eb96691864f520823a417cd2f3278b4c2ac579490941d6c623865e478828c8b 2022-11-23T02:59:14.8672647Z deleted: sha256:e93d6940ac64ac73f178cc63066fb2c3ab041023d66146b32019cb7860511be5 2022-11-23T02:59:14.8673290Z deleted: sha256:e302e1f04c7e3031f83227f08d6987b02f39a75ac0e741754afad2dc1e265f8a 2022-11-23T02:59:14.8673720Z deleted: sha256:d82cdf793dbcd047c1843326443a1249721e7308a7c6fb3e23fe7331652e7047 2022-11-23T02:59:14.8674157Z deleted: sha256:3edb430c2f9009d4993daf017be01fe272bd3452db11c16e51f7755ac845d410 2022-11-23T02:59:14.8674595Z deleted: sha256:16e8f362c1784e16c1db6b1d0aa4449097e6d646f4c8682a122dea7c4da38aaf 2022-11-23T02:59:14.8675032Z deleted: sha256:7f58576cf19df9f3be9082f2c0ec2fc7010409b97ecb99bae66a10805d752f48 2022-11-23T02:59:14.8675445Z deleted: sha256:88688611a15825ecae20cd8c4032711d2351d2f954a9ebcd4c671b2bdb017df8 2022-11-23T02:59:14.8675879Z deleted: 
sha256:a46e0b74ccdcd4e2eb07727be3bc1a2c4236b1f88c65e64a50234e8a35932a80 2022-11-23T02:59:14.8676334Z deleted: sha256:b633962159aa14dfe94a149d00f90eecaba6dab960d4011bdf3667a5ee9586db 2022-11-23T02:59:14.8676760Z deleted: sha256:a05c7409499ce8c5d7ffc085772c3910c812ec835dc9145bbbb07b8b3c075235 2022-11-23T02:59:14.8677207Z deleted: sha256:0d63a7de5066f69cd9fd1af8fc47405e880de8f88f5cb16278a1f1ac94d0cd41 2022-11-23T02:59:14.8677637Z deleted: sha256:7d74b4ce1a60334100fccc0917345873714e640160622691b579d64c0ae4640f 2022-11-23T02:59:14.8678059Z deleted: sha256:33aae29ffc4791507bef289cfb1f178909f3fc97a40c618723eeec1f8f5bd80c 2022-11-23T02:59:14.8678492Z deleted: sha256:6ea72b84f0436ed1d288baf124dd38e43bbb89e746ccfd3a4ec420ddced8bbc2 2022-11-23T02:59:14.8678948Z deleted: sha256:04e33e1cfdd5a1b2409b80f5881e6cf7b1810fe975aad4ce7c97b0ff6c0e7b4e 2022-11-23T02:59:14.8679408Z deleted: sha256:df1ef30e86bf04681ecd0728263efe1e98b2eea0a228cef29bd0febfc8bdac2f 2022-11-23T02:59:14.8679838Z deleted: sha256:36a44974e500014175f5e49f50c8afa1ac9c5e8092a8ea99c3c97b7ce9c517d8 2022-11-23T02:59:14.8680266Z deleted: sha256:a31f0224d50d031823b07dbb97f256f6960c87ea3c52ebeceef98febab200451 2022-11-23T02:59:14.8680702Z deleted: sha256:c4beec84548d277aff0487a9a5a8c2b3d577421e3275f36106b778c6edbb9d53 2022-11-23T02:59:14.8681130Z deleted: sha256:bcc7df3b45729f5d1802045954e76e3407d9e07ba6f516de0895d775d00ad7f8 2022-11-23T02:59:14.8681555Z deleted: sha256:84de992a179a16ba619507ec45b04b4c0da3d3fa31cedc8f6beb5aaadd7a232a 2022-11-23T02:59:14.8681988Z deleted: sha256:5011206a0b2edc2a6c68ba41313e7f283ee7c925ab6a731f8818d01352f68596 2022-11-23T02:59:14.8682420Z deleted: sha256:46a56b12ac94daa35c90ac97d26adfde704693e34613d69fb97687aa53ae33f5 2022-11-23T02:59:14.8682848Z deleted: sha256:ed2b7a9e28b3474bc9b7e68f8158ecda88b3fa3d3ab1587898fa976922af0deb 2022-11-23T02:59:14.8683286Z deleted: sha256:4a6976746db7764bb48f2a06af1fb5f88e3646edc1c9bc0d18686d5a6350cac0 2022-11-23T02:59:14.8683903Z deleted: 
sha256:5e175425e3e9ec93e8c6c1b7560b49ef5e95af68ec55757902072a8dca020323 2022-11-23T02:59:14.8684347Z deleted: sha256:fb740502513c6cf883c844f03760de367c4c70d09a69b9476bcf737b4578563a 2022-11-23T02:59:14.8684751Z deleted: sha256:2c105119fc030d11b3d570ec9a83948a1fb17f138df2a3245f9566b89de51495 2022-11-23T02:59:14.8685197Z deleted: sha256:8caad6b6cba0d0ced7e21fe4b2027b8647d66b7f78c34367dd8571a0520ba2c0 2022-11-23T02:59:14.8685661Z deleted: sha256:1051db32aefad193995ca536ed99e29eed4fd0340ddda721ec11e9c4eb9e93af 2022-11-23T02:59:14.8686168Z deleted: sha256:c6b2a4553f41b3b4a3dc6a26be0020c98980bb4e7186d194901769dce6716c27 2022-11-23T02:59:14.8686618Z deleted: sha256:8faec3528fe75bb31f14d0caf8707a2fe4b70f60d7e631c2b3dbb36cd6d83dd9 2022-11-23T02:59:14.8687068Z deleted: sha256:7574bc80094251ac667e6bed9dd5a808ecf6f61f23c8d4c56a69c644d06f4e32 2022-11-23T02:59:14.8687500Z deleted: sha256:69f57fbceb1b420d7e4697e0f6514887b0805ee0059bea7d51e0a832962e74bf 2022-11-23T02:59:14.8687729Z 2022-11-23T02:59:14.8767283Z Total reclaimed space: 21.93GB 2022-11-23T02:59:14.8832874Z Post job cleanup. 2022-11-23T02:59:14.8870825Z Post job cleanup. 
2022-11-23T02:59:15.0233610Z [command]/usr/bin/git version 2022-11-23T02:59:15.0288255Z git version 2.37.1 2022-11-23T02:59:15.0351890Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/2e7aaed1-6cc6-418b-9c74-de5b964a5132' before making global git config changes 2022-11-23T02:59:15.0352471Z Adding repository directory to the temporary git global config as a safe directory 2022-11-23T02:59:15.0358778Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2022-11-23T02:59:15.0402615Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2022-11-23T02:59:15.0439263Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || : 2022-11-23T02:59:15.0762976Z Entering 'android/libs/fbjni' 2022-11-23T02:59:15.0804643Z Entering 'third_party/FP16' 2022-11-23T02:59:15.0846315Z Entering 'third_party/FXdiv' 2022-11-23T02:59:15.0888332Z Entering 'third_party/NNPACK' 2022-11-23T02:59:15.0932694Z Entering 'third_party/QNNPACK' 2022-11-23T02:59:15.0974280Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T02:59:15.1015576Z Entering 'third_party/XNNPACK' 2022-11-23T02:59:15.1070674Z Entering 'third_party/benchmark' 2022-11-23T02:59:15.1111465Z Entering 'third_party/cpuinfo' 2022-11-23T02:59:15.1153525Z Entering 'third_party/cub' 2022-11-23T02:59:15.1195806Z Entering 'third_party/cudnn_frontend' 2022-11-23T02:59:15.1243011Z Entering 'third_party/cutlass' 2022-11-23T02:59:15.1292214Z Entering 'third_party/eigen' 2022-11-23T02:59:15.1336451Z Entering 'third_party/fbgemm' 2022-11-23T02:59:15.1377330Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T02:59:15.1418584Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T02:59:15.1460114Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T02:59:15.1500550Z Entering 
'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T02:59:15.1543821Z Entering 'third_party/flatbuffers' 2022-11-23T02:59:15.1587241Z Entering 'third_party/fmt' 2022-11-23T02:59:15.1628221Z Entering 'third_party/foxi' 2022-11-23T02:59:15.1669815Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T02:59:15.1714254Z Entering 'third_party/gloo' 2022-11-23T02:59:15.1759406Z Entering 'third_party/googletest' 2022-11-23T02:59:15.1801679Z Entering 'third_party/ideep' 2022-11-23T02:59:15.1842119Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T02:59:15.1887168Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T02:59:15.1937422Z Entering 'third_party/ios-cmake' 2022-11-23T02:59:15.1980417Z Entering 'third_party/ittapi' 2022-11-23T02:59:15.2022931Z Entering 'third_party/kineto' 2022-11-23T02:59:15.2063958Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T02:59:15.2104884Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T02:59:15.2147518Z Entering 'third_party/nccl/nccl' 2022-11-23T02:59:15.2189284Z Entering 'third_party/neon2sse' 2022-11-23T02:59:15.2231040Z Entering 'third_party/nlohmann' 2022-11-23T02:59:15.2274034Z Entering 'third_party/onnx' 2022-11-23T02:59:15.2329062Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T02:59:15.2371258Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T02:59:15.2416461Z Entering 'third_party/onnx-tensorrt' 2022-11-23T02:59:15.2459031Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T02:59:15.2505791Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T02:59:15.2548622Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T02:59:15.2590687Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T02:59:15.2637767Z Entering 'third_party/pocketfft' 2022-11-23T02:59:15.2680569Z Entering 'third_party/protobuf' 
2022-11-23T02:59:15.2725714Z Entering 'third_party/protobuf/third_party/benchmark' 2022-11-23T02:59:15.2768804Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T02:59:15.2812343Z Entering 'third_party/psimd' 2022-11-23T02:59:15.2853840Z Entering 'third_party/pthreadpool' 2022-11-23T02:59:15.2895247Z Entering 'third_party/pybind11' 2022-11-23T02:59:15.2938593Z Entering 'third_party/python-enum' 2022-11-23T02:59:15.2980605Z Entering 'third_party/python-peachpy' 2022-11-23T02:59:15.3022104Z Entering 'third_party/python-six' 2022-11-23T02:59:15.3063487Z Entering 'third_party/sleef' 2022-11-23T02:59:15.3105121Z Entering 'third_party/tbb' 2022-11-23T02:59:15.3149450Z Entering 'third_party/tensorpipe' 2022-11-23T02:59:15.3191646Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T02:59:15.3233159Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T02:59:15.3274597Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T02:59:15.3316396Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T02:59:15.3357985Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T02:59:15.3402689Z Entering 'third_party/zstd' 2022-11-23T02:59:15.3459858Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2022-11-23T02:59:15.3489771Z http.https://github.com/.extraheader 2022-11-23T02:59:15.3499183Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2022-11-23T02:59:15.3535781Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || : 2022-11-23T02:59:15.3852160Z Entering 'android/libs/fbjni' 2022-11-23T02:59:15.3877643Z http.https://github.com/.extraheader 2022-11-23T02:59:15.3911091Z Entering 'third_party/FP16' 2022-11-23T02:59:15.3935530Z 
http.https://github.com/.extraheader 2022-11-23T02:59:15.3969913Z Entering 'third_party/FXdiv' 2022-11-23T02:59:15.3995323Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4027093Z Entering 'third_party/NNPACK' 2022-11-23T02:59:15.4052549Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4085826Z Entering 'third_party/QNNPACK' 2022-11-23T02:59:15.4110259Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4143335Z Entering 'third_party/VulkanMemoryAllocator' 2022-11-23T02:59:15.4168726Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4202342Z Entering 'third_party/XNNPACK' 2022-11-23T02:59:15.4226506Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4271207Z Entering 'third_party/benchmark' 2022-11-23T02:59:15.4295187Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4328039Z Entering 'third_party/cpuinfo' 2022-11-23T02:59:15.4352968Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4385433Z Entering 'third_party/cub' 2022-11-23T02:59:15.4411025Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4443351Z Entering 'third_party/cudnn_frontend' 2022-11-23T02:59:15.4467674Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4505681Z Entering 'third_party/cutlass' 2022-11-23T02:59:15.4530825Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4571487Z Entering 'third_party/eigen' 2022-11-23T02:59:15.4595783Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4629667Z Entering 'third_party/fbgemm' 2022-11-23T02:59:15.4654721Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4687179Z Entering 'third_party/fbgemm/third_party/asmjit' 2022-11-23T02:59:15.4711518Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4743656Z Entering 'third_party/fbgemm/third_party/cpuinfo' 2022-11-23T02:59:15.4768949Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4802319Z Entering 'third_party/fbgemm/third_party/googletest' 2022-11-23T02:59:15.4825985Z http.https://github.com/.extraheader 
2022-11-23T02:59:15.4858300Z Entering 'third_party/fbgemm/third_party/hipify_torch' 2022-11-23T02:59:15.4882593Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4918068Z Entering 'third_party/flatbuffers' 2022-11-23T02:59:15.4942456Z http.https://github.com/.extraheader 2022-11-23T02:59:15.4976664Z Entering 'third_party/fmt' 2022-11-23T02:59:15.5001049Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5033349Z Entering 'third_party/foxi' 2022-11-23T02:59:15.5057329Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5089704Z Entering 'third_party/gemmlowp/gemmlowp' 2022-11-23T02:59:15.5114725Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5146530Z Entering 'third_party/gloo' 2022-11-23T02:59:15.5171031Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5202871Z Entering 'third_party/googletest' 2022-11-23T02:59:15.5226881Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5259336Z Entering 'third_party/ideep' 2022-11-23T02:59:15.5283988Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5315067Z Entering 'third_party/ideep/mkl-dnn' 2022-11-23T02:59:15.5338793Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5373415Z Entering 'third_party/ideep/mkl-dnn/third_party/oneDNN' 2022-11-23T02:59:15.5397559Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5437767Z Entering 'third_party/ios-cmake' 2022-11-23T02:59:15.5461557Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5493562Z Entering 'third_party/ittapi' 2022-11-23T02:59:15.5519604Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5551269Z Entering 'third_party/kineto' 2022-11-23T02:59:15.5574610Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5606738Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2022-11-23T02:59:15.5630277Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5663170Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2022-11-23T02:59:15.5687784Z 
http.https://github.com/.extraheader 2022-11-23T02:59:15.5721625Z Entering 'third_party/nccl/nccl' 2022-11-23T02:59:15.5745576Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5778315Z Entering 'third_party/neon2sse' 2022-11-23T02:59:15.5804294Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5837824Z Entering 'third_party/nlohmann' 2022-11-23T02:59:15.5862098Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5896326Z Entering 'third_party/onnx' 2022-11-23T02:59:15.5921331Z http.https://github.com/.extraheader 2022-11-23T02:59:15.5966863Z Entering 'third_party/onnx/third_party/benchmark' 2022-11-23T02:59:15.5991606Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6024457Z Entering 'third_party/onnx/third_party/pybind11' 2022-11-23T02:59:15.6048883Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6083421Z Entering 'third_party/onnx-tensorrt' 2022-11-23T02:59:15.6107382Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6139480Z Entering 'third_party/onnx-tensorrt/third_party/onnx' 2022-11-23T02:59:15.6163764Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6201219Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/benchmark' 2022-11-23T02:59:15.6225034Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6258738Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11' 2022-11-23T02:59:15.6283838Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6315952Z Entering 'third_party/onnx-tensorrt/third_party/onnx/third_party/pybind11/tools/clang' 2022-11-23T02:59:15.6340720Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6378222Z Entering 'third_party/pocketfft' 2022-11-23T02:59:15.6404079Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6436309Z Entering 'third_party/protobuf' 2022-11-23T02:59:15.6461042Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6497169Z Entering 'third_party/protobuf/third_party/benchmark' 
2022-11-23T02:59:15.6521267Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6553838Z Entering 'third_party/protobuf/third_party/googletest' 2022-11-23T02:59:15.6577508Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6611747Z Entering 'third_party/psimd' 2022-11-23T02:59:15.6638280Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6670271Z Entering 'third_party/pthreadpool' 2022-11-23T02:59:15.6694456Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6726914Z Entering 'third_party/pybind11' 2022-11-23T02:59:15.6751921Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6784068Z Entering 'third_party/python-enum' 2022-11-23T02:59:15.6809410Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6842064Z Entering 'third_party/python-peachpy' 2022-11-23T02:59:15.6866511Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6898139Z Entering 'third_party/python-six' 2022-11-23T02:59:15.6923445Z http.https://github.com/.extraheader 2022-11-23T02:59:15.6956427Z Entering 'third_party/sleef' 2022-11-23T02:59:15.6980805Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7013744Z Entering 'third_party/tbb' 2022-11-23T02:59:15.7038274Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7074085Z Entering 'third_party/tensorpipe' 2022-11-23T02:59:15.7098778Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7131825Z Entering 'third_party/tensorpipe/third_party/googletest' 2022-11-23T02:59:15.7156076Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7187725Z Entering 'third_party/tensorpipe/third_party/libnop' 2022-11-23T02:59:15.7211824Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7243569Z Entering 'third_party/tensorpipe/third_party/libuv' 2022-11-23T02:59:15.7267620Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7299740Z Entering 'third_party/tensorpipe/third_party/pybind11' 2022-11-23T02:59:15.7323719Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7355473Z Entering 
'third_party/tensorpipe/third_party/pybind11/tools/clang' 2022-11-23T02:59:15.7379300Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7415313Z Entering 'third_party/zstd' 2022-11-23T02:59:15.7440410Z http.https://github.com/.extraheader 2022-11-23T02:59:15.7767813Z Cleaning up orphan processes